Preface


Contact studies is a field of linguistics which has been the subject of increasing 
interest in the past few decades and the present volume is intended to reflect 
this interest by gathering together contributions by leading authors in the field. 
The volume deals with both individual cases of language contact and more 
general issues of the relationship of contact studies to other areas of linguistics. 
The individual studies are exemplary illustrations of a range of contact scenarios 
while the more general chapters deal with the interface of language contact with 
such areas as typology, language history, dialectology, sociolinguistics, and pidgin 
and creole studies. 

The genesis of this volume was marked by a fruitful collaboration between the 
editor and the colleagues who contributed. This congenial experience was unfor- 
tunately overshadowed by the death of one of the scholars in the project, Michael 
Noonan (1948-2009) of the University of Wisconsin, known affectionately as 
“Mickey” to his friends. His sudden departure was an unexpected and painful 
loss to all who knew him. 

The work on this project was greatly facilitated by the efficiency, professionalism, 
and helpfulness shown by the staff of Wiley-Blackwell, in particular by Danielle 
Descoteaux, Julia Kirk, and Anna Oxbury. To them I would like to express my 
sincere thanks for all that they have done in the production of this volume. 


Raymond Hickey 


Language Contact: Reconsideration and Reassessment


RAYMOND HICKEY 


The most cursory glance at linguistic publications in the past few decades reveals 
a wealth of literature on language contact: articles, monographs, edited volumes, 
special issues of journals (see the references in the literature section to this 
chapter). It is perhaps true to say that one of the major impulses for research in
the past two decades must surely have been the publication of Sandra Thomason 
and Terrence Kaufman’s large-scale study of various contact scenarios with many 
generalizations about the nature of contact and the range of its possible effects 
(Thomason & Kaufman 1988). Due to the carefully mounted cases and several 
stringent analyses, this study led to the reinvigoration of language contact
studies and the re-valorization of language contact as a research area. As well 
as highlighting the field of language contact within linguistics, the study also 
allowed for virtually any type of change as a result of language contact, given 
appropriate circumstances to trigger this. 

Contact studies from the 1960s and 1970s are not anything like as copious as 
in the ensuing decades. There are reasons for this. While the classic study of 
language contact by Uriel Weinreich was published in 1953, the following two 
decades were years which saw not just the heyday of early generative linguistics 
but also the rise of sociolinguistics, and it was those two directions in linguistics 
which were to dominate the research activity of scholars for a number of decades. 

Language contact was at the center of work by scholars somewhat outside 
the mainstream. Smaller departments at universities, dealing with non-Indo- 
European languages or Indo-European ones apart from the Germanic and 
Romance languages, often produced research in which contact was pivotal. But 
for scholars in the English-speaking world, or dealing with varieties of English, 
language contact was not a primary concern during the 1960s and 1970s. Apart 
from the dominance of other approaches to linguistics at this time, there were 
further reasons for the relative neglect of language contact. Older literature 
which looked at contact tended to assume uncritically that contact was always 
the source of new features registered in particular languages, assuming the presence
of at least two languages in any given scenario. Furthermore, early studies did not
necessarily provide rigorous taxonomies for the various types of language contact
and their effects (though Weinreich is a laudable exception in this respect).
Nor did they usually distinguish individual tokens of language contact from the 
contact of language systems and the indirect effects which the latter situation 
could have. 

Overviews of aspects of language which also touched on contact did of course 
have relevant chapters, e.g. that by Moravcsik (1978) in the Greenberg volumes 
on language universals. And the early 1980s did see studies of language contact, 
e.g. Heath (1984), but other suggestions for the triggers of language change were 
preferred, at least in mainstream language studies, such as varieties of English, 
see Harris (1984), an influential article arguing against the role of contact in the 
rise of varieties of English in Ireland. 


1 Recent Studies of Language Contact 


The stimulus provided by Thomason and Kaufman (1988) is in evidence, directly 
or indirectly, in the many publications which appeared during the 1990s and into 
the 2000s. Some of these are in a more traditional style, e.g. Ureland and Broderick 
(1991), but others show a linguistically nuanced analysis of the effects of contact, 
see the contributions in Fisiak (1995) and Thomason (1997b), along with the 
typological overview in Thomason (1997a). Indeed these publications often con- 
tain a blend of contact studies and a further approach in linguistics, consider the 
sociolinguistically based investigation of language contact in Japan by Loveday 
(1996) or the large-scale typological studies in Dutton and Tryon (1994). 

The 2000s opened with a number of analyses of different contact scenarios. 
There is the general overview of language contact and change by Frans van 
Coetsem (van Coetsem 2000) along with the overview article by Thomason 
(2000), the study of contact within the context of the Slavic languages by Gilbers,
Nerbonne, and Schaeken (2000) and the investigation of lexical change due to
contact in King (2000), to mention just three of the publications from this year.

2001 saw the publication of Sarah Thomason’s introduction to language con- 
tact (Thomason 2001) and of a volume on language contact and the history of 
English (Kastovsky and Mettinger 2001), as well as the overview of features in 
English-lexicon contact languages (pidgins and creoles) by Baker and Huber 
(2001). The latter type of investigation characterizes volumes such as that by 
McWhorter (2000), the full-length study by Migge (2003), the edited volume by 
Escure and Schwegler (2004), as well as the special journal issue by Clements and 
Gooden (2009). 

Clyne (2003) is a monograph which examined language contact between 
English and immigrant languages in Australia. This type of contact is grounded 
in bilingualism, an avenue of research which has been pursued in recent years, 
see Myers-Scotton (2002) as a representative example. Further studies concern 
other kinds of contact-based varieties of English far from the European context, 
e.g. Chinese Englishes, see Bolton (2003). 


Language contact, linguistic areas, and typology


Research into language families and linguistic areas received considerable impetus 
during the 2000s. The native languages of northern South America were scrutin- 
ized in Aikhenvald (2002a, 2002b). This vein of investigation was continued 
with Aikhenvald and Dixon (2006). Johanson (2002) looked at structural change 
in the Turkic languages which can be traced to contact (see Johanson, this volume, 
as well). Similar studies from the early 2000s, e.g. Haspelmath (2001), attest to 
this revitalized interest in the study of linguistic areas (Matras, McMahon, & Vincent, 
2006). 

Language typology and its connection with language contact is a theme in 
studies which congregate around families and areas, see the contributions in 
Haspelmath et al. (2001), Dahl and Koptjevskaja-Tamm (2001), Aikhenvald and 
Dixon (2006), and also in association with the issue of language development 
and complexity, see the chapters in Miestamo, Sinnemäki, and Karlsson (2008)
and the study by Mufwene (2008). 

Furthermore, there are languages whose entire development and history is 
dominated by contact with other languages: Romani and Yiddish are good 
examples of this situation, see Matras (1995; 2002) and Jacobs (2005) on these two 
languages respectively. 

Several studies of contact have stretched backwards to reach greater time depth 
using the tools of contemporary linguistics. Ross (2003) is an example of this in
his investigation of prehistoric language contact. Salmons and Joseph (1998) look 
at the evidence for and against Nostratic, an undertaking in which contact is 
center-stage. For contact and early Finno-Ugric, see Laakso (this volume) and for 
contact and Arabic, see Versteegh (this volume). 

The investigation of languages which have virtually no written records presents 
a special set of problems. This is particularly true of native American languages 
(Mithun, this volume), of African languages (Childs, this volume), of Australian 
languages (McConvell, this volume) and of languages in New Guinea (Foley, this 
volume). 


Language contact and mixed languages 


Not unrelated to this type of situation is that of mixed languages, the result not 
just of contact but of fusion, to which the attention of the scholarly community 
was drawn by a number of seminal publications, among the earliest of which 
was Muysken (1981) which presented the case of Media Lengua, a mixture of 
Quechua and Spanish (see Muysken 1997 for a later overview). A broader 
perspective was provided by the collection of studies on a number of mixed 
languages to be found in Bakker and Mous (1994). Cases of mixed languages 
have also been reported in language endangerment situations, e.g. that of Light
Warlpiri in Northern Australia (O’Shannessy 2005). An instance of a mixed lan- 
guage from the Slavic area would be Surzhyk, a blend of Russian and Ukrainian, 
see Grenoble (this volume). A further example is Trasianka (a blend of Belarussian 
and Russian). The Romance languages also have similar mixtures which arose
due to contact, e.g. that between Portuguese and Spanish in the border areas of 
Brazil and Uruguay, see remarks by Lipski (this volume) on portunhol/fronterizo. 


Language contact, obsolescence, and death 


Language obsolescence (Dorian 1989) and language death (Nettle & Romaine 
2000; Romaine, this volume; Harrison 2007) are further issues closely related to 
language contact. After all, the endangerment of a language always goes hand in 
hand with contact with one or more dominant languages, the latter threatening 
the continuing existence of the minority language, or indeed in many cases lead- 
ing to its disappearance. 


Language contact and grammaticalization 


The study of grammaticalization received significant impulses from the research 
of Elizabeth Traugott, Bernd Heine, and Paul Hopper in a number of landmark 
publications, such as Traugott and Heine (2001), as well as the accessible text- 
book, Hopper and Traugott (2003 [1993]). In the context of the present volume 
the focus on grammaticalization and language contact was made in the programmatic
article by Heine and Kuteva (2003) which was followed up by the
full-length study Heine and Kuteva (2005), see Heine and Kuteva (this volume), 
as well. 


Language contact and older hypotheses 


The assessment of language contact in the history of established languages is a 
matter which has varied in the relevant scholarship. For the history of English it 
is clear that the influence of other languages (bar Latin, Old Norse, and
Anglo-Norman) has been played down by the majority of scholars in the field. But in
recent years, a reexamination and reassessment of the role of contact in the devel- 
opment of the Germanic dialects in the period subsequent to the transportation 
to England has taken place. Specifically, the role of British Celtic in this context 
has been highlighted by publications such as Filppula, Klemola, and Pitkänen
(2002), Filppula, Klemola, and Paulasto (2008), and Hickey (1995b), re-connecting 
to an older hypothesis put forward by German and Scandinavian scholars in the 
first half of the twentieth century, see Preusler (1938), Dal (1952), and Braaten
(1967). Contact as a source of change has been further extended to encompass 
later, nonstandard features of English such as the so-called Northern Subject 
Rule, see Klemola (2000). For details on the “Celtic hypothesis” in the history of 
English, see Filppula (this volume). 


Language and/or dialect contact 


It is obvious that the difference between language contact and dialect contact is 
more one of degree than of kind. The interaction of dialects with one another is 
a topic which received considerable impetus from Peter Trudgill’s 1986 study 
Dialects in Contact after which the treatment of this subject was seen as on a
par with that of languages in contact. Given the great diversity of varieties of 
English, this approach proved to be fruitful in the anglophone world and has 
been adopted by many scholars since, especially by considering the notion of 
accommodation together with existing data not hitherto analyzed from this
perspective. Dialects in contact are treated in this volume in the contributions 
by David Britain and Paul Kerswill (in the context of new varieties) as well as 
Joseph Salmons and Thomas Purnell (in the context of American English). 


Language contact in English studies 


In English studies the significance of contact in the rise of nonstandard vernaculars 
was given increasing recognition during the 1980s. Rickford (1986) is a well-known 
example of work in this vein, here with specific reference to dialect transporta- 
tion and contact at overseas locations. However, not all scholars saw contact as 
a prime source of new features in varieties, some put more emphasis on the 
continuation of vernacular traits at new locations. This stance forms the so-called 
retentionist hypothesis which enjoyed greatest favor among Anglicists; a key 
article for this view is Harris (1984). However, by the late 1980s and into the early 
1990s, the considered case for contact in certain scenarios regained acceptance 
and was underlined by key publications such as Mesthrie (1992) which showed 
clearly the role contact played in the rise of South African Indian English. The 
dichotomy of contact versus retention continued to occupy scholars into the 2000s, 
see Filppula (2003) which provides a fresh look at the arguments. The role of 
contact in the formation of different varieties of English at various geographical 
locations has been considered, e.g. Bao (2005) which examines substratist influ- 
ence on the aspectual system of English in Singapore. For contact and African 
Englishes, see Mesthrie (this volume) and for Asian Englishes, see Ansaldo (this 
volume). 


Vernacular universals and contact 


The notion of vernacular universals is something which has been dealt with by 
Anglicists in recent years, above all by Jack Chambers (see Chambers 2004). It 
refers to features found across varieties of English in different parts of the world 
and postulates that the occurrence of such features is due to universals of lan- 
guage development, specifically in the context of new dialect formation (see Gold 
2009, for example). The issue has spawned a number of publications the most 
comprehensive of which is the volume by Filppula, Klemola, and Paulasto (2009b) 
in which vernacular universals are viewed within the framework of language 
contact, see the introduction to that volume (Filppula et al. 2009a) and also the 
contribution by Donald Winford (Winford 2009). 


Sociolinguistic perspectives on language contact 


An emphasis on the social setting in which language contact can take place 
is found in many publications, e.g. those in Potowski and Cameron (2007) on 
Spanish and contact and in particular in studies of pidgins and creoles (Deumert
and Durrleman 2006; Holm, this volume; Schneider, this volume). Studies
like Siegel (1987), where the plantation environment of the Fiji Islands in the 
nineteenth century is investigated, implicitly adopt this stance. The role of 
substrate in the rise of these contact languages has also been pursued in other 
publications by Siegel (1999, 2000a, 2000b, 2008, this volume). In a far-eastern 
context this issue has also been broached, see the discussion in Matthews (this
volume). 

In handbooks on sociolinguistics and models of socially determined language 
change, chapters on contact can also be found, e.g. Sankoff (2002) in the Handbook 
of Language Variation and Change (Chambers, Trudgill, & Schilling-Estes 2002). 

A broader view than just the social setting can be found in considerations of a 
language’s ecology, see Mufwene (2001, 2007) and the discussion in Ansaldo 
(this volume). 


Contact in urban environments 


In the past, contact studies did not usually deal with the rural-urban dichotomy, 
probably because at the time at which the contact is assumed to have taken place 
this division was not relevant for the communities in question. However, con- 
temporary investigations of contact, either interlinguistic or intralinguistic, are 
frequently of urban scenarios, e.g. Silva-Corvalán’s 1994 study of Spanish and
English in Los Angeles or Hickey’s 2005 study of language variation and change 
in Dublin, where dissociation (Hickey 2000), triggered by internal contact 
between differing varieties in the city, has been the driving factor. Other urban 
environments have provided further examples of change and development 
through contact, e.g. the creative language mixture found in the Sheng and 
Engsh codes in urban Kenya (Abdulaziz & Osinde 1997). 


Overviews of language contact 


The increase in the data on language contact has led to more general reflections
on the nature of contact and its effects. This is something which can be observed 
in other fields as well. Once most of the groundwork has been done and bodies 
of data have been collected, scholars begin to reflect on the status of the field as 
a scholarly endeavor. It is in this light that one can view publications like those 
by Donald Winford, e.g. Winford (2005, 2008), and indeed the chapters in the 
first three sections of the present volume, “Contact and Linguistics,” “Contact 
and Change,” and “Contact and Society” respectively. 

A further sign of the maturity of a field is the publication of handbooks 
dedicated to it. This shows that it has become sufficiently mainstream for it to 
appear in dedicated courses at universities and hence to be worthy of handbook 
treatment. The readiness of publishers to accept such volumes is evidenced by 
the handbooks by Goebl et al. (1996), Donald Winford (Winford 2003) and Yaron
Matras (Matras 2009) and indeed by the present volume itself. 

Lastly, one can mention the center-stage treatment of language contact
accorded in handbooks of historical linguistics, such as McMahon (1994), 
McColl-Millar (2007) and Campbell (2004). 


2 Generalizations Concerning Contact 


It would seem that language contact always induces change. History does not 
provide instances of speech communities which adjoined one another, still less 
which intermingled, and where the languages of each community remained unaffected
by the contact. However, there may well be a difference in the degree to
which languages in contact influence each other, that is a cline of contact is often 
observable, indeed to the extent that the influence is almost totally unidirectional. 
Furthermore, influence may vary by level of language and depend on the nature 
of the contact, especially on whether bilingualism exists or not and to what degree 
and for what duration (see the discussion in Muysken, this volume). 


Internal versus external reasons 


It is scholarly practice to distinguish between internal and external reasons for 
language change (Hickey 2002b). Internal change is that which occurs within a 
speech community, generally among monolingual speakers, and external change 
is that which is induced by contact with speakers of a different language. 

Opinions are divided on when to assume contact as the source of change. 
Some authors insist on the primacy of internal factors (e.g. Lass & Wright 1986) 
and so favor these when the scales of probability are not biased in either an 
internal or external direction for any instance of change. Other scholars view 
external reasons more favorably (Vennemann 2001, 2002b, this volume) while 
still others would like to see a less dichotomous view of internal versus external 
factors in change (Dorian 1993; Jones & Esch, 2002). The role of contact in the 
diversification of languages is also a theme in the seminal monograph by 
Johanna Nichols (1992), a theme which is taken up in her contribution to the 
present volume. 


Substrate and superstrate 


A lot of attention has been paid in the literature to the relative social status of two 
languages in contact situations. Two established terms are used to label the lan- 
guage with less status and that with more, namely, “substrate” and “superstrate” 
respectively. The superstrate is regarded as having, or having had, more prestige 
in the society in which it is spoken, though just precisely what “prestige” refers 
to is something which linguists like James Milroy have questioned. Nonetheless, 
there would seem to be a valid sense in which one of two languages has, or had, 
more power in a contact situation. Asymmetrical levels of power in a contact 
situation play a definite role in the results of contact. 


Relative status and direction of influence


The standard wisdom has traditionally been that the language with more status 
influences that with less, i.e. borrowing is from the superstrate by the substrate. 
This is, however, a simplistic view of the possibilities of influence in a contact 
scenario. Vocabulary, as an open class with a high degree of awareness by speak- 
ers, is the primary source of borrowing from the superstrate. Again French and 
Latin in the history of English are standard examples. 

However, if contact persists over many generations, then the substrate can 
have a gradual and imperceptible influence on the superstrate, leading in some 
cases to systemic change at a later time. This type of contact can be termed 
“delayed effect contact” (Hickey 2001) and may well be the source of syntactic 
features in English which the latter has in common with Celtic (Poussa 1990; 
Vennemann 2002a; Isaac 2003). This line of thought is pursued by Filppula (this 
volume), who presents the arguments for Celtic influence on English. In addition 
to structural parallels there is further evidence here. Consider the fact that in Old 
English wealh was the word for ‘foreigner’ but also for ‘Celt’. The word came to 
be used in the sense of ‘servant, slave’ (cf. wielen ‘female slave, servant’ with the 
same root, Holthausen 1974: 393), which would appear to be an indication of the 
status of the Celts vis-à-vis the Germanic settlers. Not only that, the meaning of
‘servant’ implies that the Germanic settlers put the subjugated Celts to work for 
them; this in turn meant that there would have been considerable face-to-face 
contact between Celts and Germanic settlers, in particular between the children 
of both groups. As the latter context was one of first language acquisition it 
provided an osmotic interface for structural features of Celtic to diffuse into Old 
English. Given that written Old English was dominated by the West Saxon stand- 
ard, it is only in the Middle English period that the syntactic influence of Celtic 
becomes apparent in the written record, e.g. in the appearance of possessive 
pronouns in cases of inalienable possession. 


Where does it start? The locus of contact 


It is a convenient shorthand to claim, for example, that language A borrowed 
from language B. However, this is already an abstraction as the appearance of 
borrowings in a speech community can only be the result of actions by indi- 
vidual members of this community. If one puts aside cases of “cultural” borrow- 
ings, e.g. from Latin or Greek into later European languages or from English 
into other modern languages, then it is probably true that the borrowing of 
“systemic” material — inflections, grammatical forms, sentence structures — can 
only occur via bilinguals. This view has a considerable tradition. Weinreich (1953) 
saw the true locus of contact-induced change in the bilingual individual who 
moves between two linguistic systems. Some scholars go further and consider 
bilinguals as having a single system, e.g. Matras (this volume) who contends that 
bilinguals “do not, in fact, organize their communication in the form of two 
‘languages’ or ‘linguistic systems’.” The awareness of linguistic systems on the
part of speakers is a difficult issue to resolve. It may well be that in prehistory
and in nonliterate societies today the awareness of the separateness of languages 
was/is less than in present-day literate societies. If one of the languages a bilingual 
uses is the sole language of a country then the bilingual’s awareness of switching 
between languages increases. Matras (this volume) maintains that bilinguals 
“operate on the basis of established associations between a subset of structures 
and a set of interaction contexts.” The communicative competence of the bilingual 
then includes making the appropriate choices of structures for communication in 
given contexts. Whatever the degree of awareness by bilinguals of the separateness 
of their linguistic (sub)systems, the presence of competence in two languages fulfils 
the precondition for the adoption of material from one language into another. 
The next, and crucial, question is how borrowings, made on an individual level, 
spread throughout a community and are accepted by it. This step is essential for 
borrowings/items of transfer to become part of a language/variety as a whole and 
hence be passed on to later generations as established features. This issue will be 
addressed in Chapter 7 “Contact and Language Shift” below. 


What can be attributed to language contact? 


The current volume is dedicated to analyses of language contact, the situations 
in which it is or was to be found, and the results it engenders or has engendered. 
This focus should not imply a neglect of changes, indeed types of change, which 
are not due to language contact. Consider for instance, reanalysis by language 
learners. A specific instance of this is provided in the prehistory of Irish. The 
precursors of all the Celtic languages inherited complex suffixal inflections from 
still earlier stages of Indo-European when these were central to morphology. 
Some time before the Celtic languages appeared in writing (in the first centuries 
BC) the languages changed their typology. They began to abandon suffixal inflections
as a means of indicating grammatical categories and adopted a new system
whereby these categories were indicated by changes to the initial segments of 
lexical words, so-called initial mutation. This typological shift came about by 
children reanalyzing phonetic changes at the beginnings of words (external sandhi) 
as having systemic status (for a fuller discussion, see Hickey 1995c, 2003a). This 
is an entirely language-internal change, though the original trigger for the phon- 
etic changes, which were later reanalyzed, may have been due to contact. 


Pushing the question back 


Contact treatments tend to push the question of origin back a step but do not 
necessarily explain how a phenomenon arose in the first place. For instance, if 
one believes that the VSO word order of Insular Celtic (Eska, this volume) is due 
to contact with a Semitic language (Pokorny 1949) present in the British Isles 
before the arrival of the Celts, one still has not accounted for the rise of VSO in 
the source language. Thus contact differs from explanatory models of language
in that it offers more or less plausible accounts for the appearance of linguistic 
features. However, the explanation of contact mechanisms and speaker strategies
in contact situations can indeed have explanatory value. 


The history of contact phenomena 


It can be salutary to bear the attested history of contact phenomena in mind. The 
paths of contact may be multiplex and varied. Take, for instance, the immediate 
perfective of Irish English which is (rightly) regarded as a calque on Irish. 


(1) Tá sé tar éis an ghloine a bhriseadh.
[is he after the glass COMP break-NONFINITE] 
‘He is after breaking the glass.’ 


Both the Irish and the Irish English structure have gone through historical develop- 
ments while in contact. Originally, the structure could be used in both Irish 
and Irish English with future, i.e. prospective, reference and it is attested from 
the seventeenth and eighteenth centuries in this sense. However in both lan- 
guages, the prospective use declined and an exclusively past, i.e. retrospective, 
use came to the fore, gradually replacing the former one in the latter half of the 
nineteenth century (McCafferty 2004). 


Contact in hindsight 


If centuries lie between the period of contact and the present it may be difficult to 
reconstruct the social circumstances of the contact. However, the nature of the 
contact can often be gleaned from the results it engendered. To illustrate this 
consider lexical changes in the period immediately after the coming of the Anglo- 
Normans to Ireland in the late twelfth century. Many loans from Anglo-Norman 
appear and not a few of them are “core” vocabulary items like the words for 
‘boy’ (garsún < Anglo-Norman garçon) and ‘child’ (páiste < Anglo-Norman page).
Given that Anglo-Norman was the superstrate language in the late Middle Irish 
period, why should the Irish have borrowed such “noncultural” core items as 
‘boy’ or ‘child’? The answer would seem to lie in the manner in which these 
words entered Irish. Assume that they were not borrowed by the native Irish 
directly, but rather that the Anglo-Normans used them in their variety of Irish. 
It is a historical fact that the Anglo-Normans lived in the countryside among the 
Irish and gradually shifted to their language. During the shift period an inter- 
mediate variety was spoken by the Anglo-Normans in which they used words from 
their own language like garçon and page. Because of the power the Anglo-Normans
had in Irish society, the native Irish adopted core vocabulary items of this
Anglo-Norman variety of Irish and, for example, the negation structure Níl puinn Gaeilge
agam [is-not point Irish at-me] ‘I cannot speak Irish’, which shows the negative 
use of French point (Rockel 1989: 59). The likelihood of this scenario is strengthened 
by considering that the Anglo-Norman loans in Irish did not necessarily replace 
the native Irish words. For instance, the Anglo-Norman loan páiste exists side by
side with the original Irish word leanbh ‘child’. For ‘boy’ Irish has two words: garsún
and the original buachaill. 

To use a metaphorical term, the Anglo-Normans “imposed” parts of their 
second-language variety of the Irish language on those who surrounded them in 
late medieval rural Irish society. 


Category and exponence 


In borrowing one must distinguish between systemic and nonsystemic elements. 
The latter are typically individual words or phrases, pragmatic markers, sentence 
adverbials, or other free-floating elements which are not part of the grammatical 
structure of a language. Such elements travel well because they do not require 
integration into the system of the borrowing language and can be picked up by 
adults in a contact situation. Indeed they often migrate from one language to 
another via code-switching (Muysken, 2000; Gardner-Chloros, this volume). 
In Irish, for instance, English well, just, really have been borrowed and are used 
continually by native speakers, although the grammatical structures of Irish and 
English are very different. 

When looking at systemic elements, one must bear an essential distinction in 
mind, namely, that between a grammatical category and the exponence of this 
category. The reason the distinction is essential in contact studies is that some 
languages borrow a grammatical category but not the manner of expressing it in 
the source language. Indeed it may be true that adopting a category rather than 
its exponence is not so much a feature of borrowing as of transfer in language 
shift (see the discussion of the borrowing of structural elements in Winford, this 
volume). 


Transfer in language shift 


The term “borrowing” implies that speakers, adopting some element or category 
from a source language, do not switch to the latter. The Middle English borrow- 
ings from French are an example of this. However, many contact situations 
involve language shift. When viewing the past few centuries in Ireland one 
can see that the majority of the population was originally Irish-speaking and that 
they gradually transferred to English, particularly during the nineteenth century. 
There was no general schooling for the Irish before the 1830s so that the native 
Irish learned English by picking it up, in adulthood, from others who had a 
somewhat better knowledge of the language. This is a situation of unguided 
adult second language acquisition. Here the transfer of features from the outset 
language (Irish) to the new language (English) was at a premium. An obvious 
example of this is accent: initially, adult learners of a second language use the 
phonetic realizations of phonological units from their first language when speaking 
the second one.¹² This can still be recognized in rural western forms of Irish 
English where the phonetic realization of /ai, au, 01, ut, A/ is the same in English 
as it is in Irish. 


The search for categorial equivalence 


On a grammatical level one finds similar behavior: speakers involved in lan- 
guage shift transfer categories from their native language to the new one they are 
moving to. But what if the new language does not have an automatic equivalent 
to a grammatical category in the outset language? This triggers a search for 
categorial equivalence. Take, for instance, the habitual aspect of Irish. This has no 
formal parallel in English so that speakers during language shift would not have 
found a ready equivalent to it. Instead what happened is that the nonlexical verb 
do was co-opted to express an habitual in Irish English, as can be seen in the 
following example: 


(2) Bíonn sí ag déanamh imní faoi na leanaí. 
[is-HABITUAL she at doing worry about the children] 
‘She does be worrying about the children.’ (vernacular Irish English) 


What one can say here is that the category “habitual aspect” was transferred 
during the language shift process from Irish to Irish English. The exponence 
which was chosen derives, however, from the co-option of English do to express 
this category. The verb do is suitable for the expression of habitual aspect as 
it denotes the carrying out of an action. This means that a construction of the 
kind “X does be Y-ing” had a high probability of diffusion and acceptance in the 
community of new English speakers which arose in Ireland in the early modern 
period. 

Recall, furthermore, that for the vast majority of Irish speakers, the language 
shift took place in a nonprescriptive environment, one in which creativity was 
not restricted by notions of correctness. This meant that a degree of restructuring 
occurred in English which is normally only found in creolization scenarios (Hickey 
1997). 


Neglect of distinctions in language shift 


Just as speakers search for equivalents in the target language to categories of 
their native language during language shift, so they also neglect distinctions in 
the target language which are not found in their native language. This neglect 
may become established if the transfer variety of the target language stabilizes 
and becomes focused, as has been the case with Irish English. 

The relative infrequency of the present perfect in Irish English has been 
remarked on by many authors since the nineteenth century (Hickey 2007: 142-5). 
As this verbal category does not exist in Irish one can surmise that it was 
neglected by learners of English in the period of language shift. In Irish English 
actions which began in the past and continue into the present, or which are 
relevant to the present, are expressed by the simple present or past, whichever 
is appropriate. 


(3) a. I know her since a long time now. 
    b. Tá aithne agam uirthi le tamall fada anois. 
       [is knowledge at-me on-her with time long now] 
    c. He’s married for ten years. 
    d. Tá sé pósta le deich mbliana anuas. 
       [is he married with ten years down] 


What does not get transferred in language contact? 


Not only is there a neglect of grammatical distinctions during contact with shift, 
as just outlined, but there are certain categories which appear never to be trans- 
ferred. Consider the following list which shows features which do not occur in 
contact varieties of Irish English where one would expect the greatest influence 
of Irish grammatical structures (Hickey 2007: 142). 


1 VSO word order 
  Dhíol mé mo theach. ‘I sold my house.’, lit. ‘sold I my house’ 
2 Post-posed adjectives 
  an fear bocht ‘the poor man’, lit. ‘the man poor’ 
3 Post-posed genitives 
  teach Sheáin ‘John’s house’, lit. ‘house John-GEN’ 
4 Absence of personal pronoun in present tense (pro-drop) 
  Ní thuigim tada. ‘I don’t understand anything.’, lit. ‘not understand-1ST_PERS_SG anything’ 
5 Autonomous verb form 
  Rinneadh an obair. ‘The work was done.’, lit. ‘done-was the work’ 
  Rugadh mac di. ‘She bore a son.’, lit. ‘born-was a son to-her’ 
6 Possessive pronoun and “verbal noun” 
  Bhí sé á bhagairt. ‘He was threatening him.’/‘He was threatening 
  (to do) it.’, lit. ‘was he at-his threatening’ 
  Bhí sé á bagairt. ‘He was threatening her.’, lit. ‘was he at-her 
  threatening’ 


The absence of some of these features can be accounted for by the lack of typo- 
logical fit between the two languages. For instance, item (5) was rendered via 
the English passive and item (6) was expressed using a simple direct object. But 
items (1)-(4) seem to be of a different nature (see Corrigan, this volume, for further 
discussion). They would appear to derive in Irish from parameter settings: 
(1)-(3) are dependent on the setting for direction of modification and (4) for that 
of the pro-drop parameter. Irish has post-modification (VSO, N + Adj, N + Gen) 
and a positive setting for pro-drop. In English the reverse is the case. It would 
appear that Irish speakers switching to English intuitively recognized this and 
never used the Irish values in their shift variety of English. 


Nonbinary categories in contact 


The results of contact are not always obvious at first sight. Where something is 
borrowed — a sound, a word, or a structure — which hitherto did not exist in the 
borrowing language, the matter is fairly clear. But what of the following situ- 
ation? A language has a certain grammatical device X and so does another lan- 
guage with which it is in contact. However, X in the second language has a much 
greater range of applications than in the first language. In the course of time, the 
first language expands the range of contexts in which it can use X. Here is a 
concrete example. Irish allows the fronting of elements of a sentence in order to 
topicalize them. The process involves clefting, that is, a dummy verb forms a 
main clause with the topicalized element and the remainder of the sentence is 
contained in a following relative clause. 


(4) a. Is go Gaillimh atá sé imithe. 
       [is to Galway that-is he gone] 
       ‘It’s to Galway he’s gone.’ 
    b. Is mór le Máire atá a mac. 
       [is great with Mary that-is her son] 
       ‘It’s friendly with Mary her son is.’ 


This example shows the difficulty of deciding whether contact is the source of a 
nonbinary category, i.e. one of degree and not of presence versus absence. Indeed 
such instances highlight the nature of linguistic argumentation in contact scenarios 
where one operates with statements of probability but nothing more concrete. In 
the final analysis it is a question of individual preference just how much weight 
one accords contact accounts. 


Permeability of linguistic systems 


There is nothing in the structure of a language which is excluded from 
borrowing/transfer through contact. Given sufficient intensity and duration, all linguistic 
subsystems can be affected, even the core morphology. Nonetheless, there are 
areas of language which show much greater movement in a contact situation. 
Single words and phrases as well as pragmatic markers and sentence adverbials 
are borrowed easily (Matras 1998). The reason is clear: such elements do not 
require integration into the grammatical system of the borrowing language and 
can be accommodated without any degree of restructuring. In a language shift 
situation syntactic variation can occur as a result of transfer during the shift 
phase, often due to the development of alternative strategies to reach equivalents 
to grammatical categories and structures of the outset language, e.g. relative 
clauses in South African Indian English (Mesthrie & Dunne 1990) whose 
speakers have South Asian heritage languages. Among the other motivations for 
borrowing/transfer are (1) the resolution of ambiguity and (2) the filling of gaps 
in paradigms for which the following examples can be given. 


Borrowing to resolve ambiguity 


A well-known instance where a borrowing helped to resolve internal ambiguity 
in the morphology of a language — or at the very least potential ambiguity — is 
provided by the third person plural pronominal forms in th- (they, them, their) 
which derive from Old Norse during the Old English period. The position before 
the borrowing was such that both the third person singular and plural pronom- 
inal forms began in h- and the difference between the third person masculine 
singular and the plural was that between /e:/ and /i:/ (hé, hí). Borrowing the 
Old Norse forms thus increased the phonetic distinctiveness of the singular and 
plural forms, though it also produced suppletion in this morphological paradigm 
(Werner 1991). 


Borrowing to fill gaps in paradigms 


There have been many cases where a gap in a morphological paradigm has been 
filled by a borrowing from a further language. The borrowing can take the form 
of a direct loan from a language or of a lexical creation made by scholars, e.g. the 
Latinate adjectives of English such as marine (noun: sea), aquatic (noun: water), 
equestrian (noun: horse), which appeared in the early modern period. 
Borrowing, or creation on the basis of external models, does not have to be the 
path taken. Consider, for instance, the paradigmatic pressure which arose in 
English with the demise of a clear distinction between the singular and plural of 
second person pronouns. This issue was resolved language-internally for the 
majority of vernacular varieties which now show a pronominal distinction 
between the singular and plural second person, e.g. y’all, y’uns, youse, yez for ‘you’-PLURAL. 
But in the case of Caribbean Englishes, the form unu, a borrowing 
from input West African languages to the area, filled the gap (Hickey 2003b). 


Convergence scenarios 


There are cases where a certain instance of change could derive from both an 
internal development, often “tension” in the language system, and the influence 
of another language (Hickey 2002c). A good example is provided by the develop- 
ment of stress patterns in the dialects of Irish. Briefly, the situation is as follows: 
Old Irish (600-900) had lexical root stress but in Middle Irish (900-1200) long 
vowels developed through vocalization of voiced fricatives in non-initial position. 
This led to tension with initial short vowels and long vowels in later syllables. 
The three major dialect areas reacted differently to this tension. 


(5) Long vowels in unstressed syllables 
    North (Ulster)    Post-initial vowels are shortened 
    West (Connacht)   Maintains the original stress pattern, possibly with 
                      syncope of the initial short vowel in some cases 
    South (Munster)   Stress is shifted to post-initial long vowels 

    scadán ‘herring’ 
    North   'V.VV > 'V.V    /'skadan/ 
    West    'V.VV > 'V.VV   /'skuda:n/ 
    South   'V.VV > V.'VV   /ska'da:n/ 


The southern shift of stress to non-initial long vowels might look like a purely 
internal solution to the tension between quantity and stress which existed prior 
to this. However, the south is the region where the influence of Anglo-Norman, 
which had non-initial stress for long vowels, was greatest. In addition it is known 
that Anglo-Norman affected varieties of English in the south-east of Ireland, 
inducing stress on non-initial long vowels. One can say of the development in 
the south of Ireland that it is a typical convergence scenario: there are cogent argu- 
ments for both an internal and external motivation for the stress change. In the 
absence of any clinching evidence either way, one maintains that both sources 
are possible, i.e. both “converged” to produce the observed output (see further 
the discussion of convergence in Matras, this volume). 


Internal developments which favor borrowing 


A nuanced view of the convergence scenario would ask whether internal changes 
can render a language more susceptible to borrowing. Recall that English is 
unique among the Germanic languages (Roberge, this volume) in requiring 
possessive pronouns in instances of inalienable possession.¹³ The inherited Old 
English type involved a personal dative as seen still in modern German, e.g. Er 
legte ihm die Krone aufs Haupt [he laid him-DATIVE the crown on the head] ‘He 
laid the crown on his head.’ This type of structure disappears in early Middle 
English and is replaced by one in which the possessive pronoun is used (see 
gloss). Now the Celtic languages were, and still are, remarkable in demanding 
the use of a possessive pronoun for inalienable possession. Could British Celtic 
of the Old English period have provided the model for the English marking? The 
answer is “yes,” but it is important to point out that the demise of overt dative 
marking (contrasting with the accusative) provided an impetus for alternative 
marking of possession. In this situation the likelihood of the adoption of a strategy 
from Celtic with which Old English was in contact was much increased. 


Contact and geographical spread 


Features which cluster in geographically confined areas and are found in lan- 
guages which are not genetically related (Noonan, this volume), or only distantly 
so, are said to characterize a linguistic area, such as the Balkans (Joseph, this 
volume), or, in a larger framework, South Asia (Schiffman, this volume). For this 
to occur, many centuries of prolonged contact and population interaction would 
seem to be required, especially as the common features of such areas belong to 
the closed classes of the languages involved, typically to the morphology and 
syntax. Furthermore, the languages of a linguistic area show not only internal 
coherence but also recognizable external boundaries with languages immediately 
outside the area (Haspelmath 2004: 211). 


Table 1 Features with a considerable geographical spread 

Feature                          Geographical spread 

Front rounded vowels,            The north and center of Europe, excluding the British 
/y/ and /ø/                      Isles and the Iberian Peninsula 

Uvular /r/ [ʁ]                   A wide band from northern France to southern 
                                 Sweden 

Initial voicing of               Flemish, Dutch and southern dialects of English 
fricatives 

Vowel epenthesis in              The Netherlands and the north Rhenish dialects of 
syllable-coda clusters           German 

High mid to front                Scotland and Northern Ireland (all varieties of 
realization of /u/ [ʉ]           English, Scottish Gaelic and Irish) 

There are also cases where features show a considerable geographical spread 
without the preconditions of linguistic areas being met. Usually, single features 
are involved and often these are features which involve the sound systems of the 
languages in question. Table 1 shows a small selection of such features (with 
differing distribution sizes) taken from European languages. Such features are 
often prosodic or realizational (uvular [ʁ], for instance; Bergs 2006) and while 
their geographical diffusion may be due to low-level copying of speech habits, 
they can in time achieve systemic status as is postulated for the spread of pros- 
odic factors in the South Asian context which resulted in systemic tone contrasts 
for many languages (Matisoff 2001). 


Regularization and language contact 


The outcome of language contact, either through borrowing or transfer, is of 
interest compared to internal developments, especially with regard to whether 
there is a difference in kind between the types of change resulting from these 
sources. One of the differences which may apply to internal but not external 
sources concerns the regularization of a language’s grammar. 

Regularization is a common phenomenon in internal change when this can be 
traced to phenomena in first language acquisition which are carried through into 
adulthood, leading in some cases to community-wide change. It is known that 
children overgeneralize patterns and constructions and in some cases these can 
survive beyond childhood, especially if the social environment is nonprescriptive, 
e.g. nonliterate or pre-literate. To take an example: the continuation of 
overgeneralization by language learners is probably how regularization of gender 
took place in Middle High German. Here disyllabic words in final -e became 
feminine because the majority were feminine anyway.¹⁴ Thus the word for ‘flower’ 
was formerly masculine but became feminine (> die Blume, Kluge & Seebold 2002: 
134), making it conform to the established pattern of words like die Sonne ‘sun’, 
die Decke ‘cover, ceiling’, etc. 

Returning to language contact, one can say that this type of development is not 
typical of contact, unless it is prolonged and intensive, i.e. continues over several 
generations. If that is the case then the same type of regularizing tendencies, 
observed for language learners in monolingual situations, can in theory occur in 
contact communities (see the discussion along these lines in Trudgill, this vol- 
ume). Furthermore, with long and close contact a regular system or subsystem 
could be adopted from another speech community, perhaps replacing an existing 
subsystem showing more irregularity than the new one. 

In this context it is worth considering what developments can be “beneficial” 
for a linguistic system. If a language borrows elements of a morphological para- 
digm then this will result in suppletion (paradigmatic irregularity) but the 
increase in formal distinctiveness between the elements of a paradigm improves 
communication, especially if these elements often stand in contrast to each other. 
Clear cases of “beneficial suppletion” include the borrowing of th- pronominal 
forms from Old Norse into Old English (see discussion above). The increase of 
formal distinctiveness is no doubt behind the development of English she (of 
disputed origin) which contrasts with he in its initial consonant. 


3 Terminology in Contact Studies 


In the discussion so far various terms have been used, e.g. borrowing, transfer, 
imposition. These are not always used with the same meanings by all authors, so 
it is advisable to offer some clarification. 


1 Borrowing Items/structures are copied from language X to language Y, but 
without speakers of Y shifting to X. In this simple form, borrowing is charac- 
teristic of “cultural” contact, e.g. Latin and English in the history of the latter, 
or English and other European languages today. Such borrowings are almost 
exclusively confined to words and phrases. 

2 Transfer During language shift, when speakers of language X are switching 
to language Y, they transfer features of their original native language X to Y. 
Where these features are already present in Y the transfer is imperceptible 
but where they are not in Y the transferred features represent an innovation. 
For grammatical structures one can distinguish (i) categories and (ii) their 
exponence (means of realization). Both (i) and (ii) can be transferred, or in 
some cases, only (i), see the discussion of habitual aspect in Irish English 
above. 

(a) Supportive transfer: A feature in language X is also found in language Y 
ensuring its continuation in the shift variety of language Y. Example: 
Irish has a distinction between second person singular and plural for 
pronouns. This fact meant that in Irish English this distinction, available 
in vernacular varieties, was supported and continues to this day. 

(b) Innovative transfer: A feature in language X is not found in language Y so 
that its transfer constitutes an innovation in Y. Example: The immediate 
perfective of Irish was transferred to Irish English, e.g. They're after sinking 
the boat, representing an addition to the aspectual distinctions already 
available. This type of transfer is also referred to as interference, especially 
in the context of guided second language acquisition where the evaluative 
implication of “interference” is often intentional. 

3 Imposition A community contains two groups: a minority with high status 
and a majority with low status. The minority acquire the language of the 
majority, eventually relinquishing their original native language. In the process 
the majority adopt features of the shift variety of the high-status minority. 
Example: Anglo-Norman and Irish in late medieval Ireland where the Anglo- 
Normans “imposed” features of their shift variety of Irish on the majority 
native population (see discussion above). (NB There is a use of “imposition” 
which goes back to van Coetsem (1988), continued by van Coetsem (2000) 
and taken up, most notably, by Donald Winford (2003, this volume) which is 
essentially the same as “transfer” as defined in (2) above.) 

4 Metatypy Discussing a Melanesian case, Malcolm Ross (1996, 2001) coined 
the term “metatypy” to denote the sharing of organizational structures across 
languages in a situation where social attitudes disfavor the replication of 
concrete word forms whose origin in another language is easily identifiable. 


Metatypy is what gives rise to a Sprachbund or language alliance, where two or 
more languages are in contact over a lengthy period and become structurally 
more and more similar, as has happened with diverse Indo-European languages 
in the Balkans (Joseph 1983) and with Indo-Aryan and Dravidian languages in 
India (Emeneau 1980). But what happens to languages during this growth in 
similarity and how this process occurs are less well known. It is often assumed 
that languages simply grow more similar to each other, converging on some 
kind of mean. However, almost all case studies show a one-sided process: one 
language (the primary lect) adapts morphosyntactically to the constructions 
of another (the secondary lect), with no change occurring in the latter. (Ross 
2003: 183) 


5 Convergence A feature in language X has an internal source, i.e. there is a 
systemic motivation for the feature within language X, and the feature is 
present in a further language Y with which X is in contact. Both internal and 
external sources “converge” to produce the same result. Example: The pro- 
gressive form in English. The two main views on this are: (a) it was an 
independent development in English (Visser 1963-73; Mitchell 1985) or (b) it 
results from contact with Celtic (Dal 1952; Preußler 1938; Wagner 1959; Braaten 
1967). A type of progressive structure in which a gerund was governed by a 
preposition existed in Old English: ic wæs on huntunge ‘I was hunting’ (Braaten 
1967: 173). The step from this structure to I was hunting is small, involving 
only the deletion of the preposition. The fully developed progressive form 
appears in Middle English, but the apparent time delay between the contact 
with Celtic in the Old English period and the surfacing of the progressive 
later can be accounted for by the strong tradition of the written standard in 
Old English (Dal 1952: 113). The progressive is found in all the Celtic lan- 
guages and can be clearly recognized in the Irish structure ag + verbal noun 
as in Tá mé ag caint léi [is me at talking with-her] ‘I am talking to her’. This in 
itself is a good example of a locative expression for progressive aspect and is 
typologically parallel to Old English ic wæs on huntunge. In both Old English 
and Celtic there was a progressive aspect, realized by means of a locative 
expression and with a similar functional range (Mittendorf & Poppe 2000: 
139). Both languages maintained this aspect and English lost the locative 
preposition, increasing the syntactic flexibility and range of the structure, 
perhaps under the supportive influence of contact with Celtic. 

“Convergence” is used here to refer to the coming together of internal and 
external factors to produce the same output, but the term can also be used to 
mean that two languages become more similar in structure, usually by one 
language approximating to the other (Ross 2001: 139). 


Borrowing and imposition 


Granted, the term “borrowing” is imprecise (nothing is “borrowed” from A to 
B), but the term is established in the field and its use ensures continuity with exist- 
ing literature. The term “copying” is in fact more accurate: speakers of language 
A copy features found in language B into their own language. This usage is 
found in Johanson (2002) and in both Johanson and Pakendorf (this volume). 

Van Coetsem’s use of the term “imposition” has two disadvantages: it denies 
the use of the intuitively more obvious “transfer” and precludes the use of “im- 
position” in the sense under (3) above. However, in other respects the approach 
initiated by van Coetsem has distinct merits, such as his highlighting of the 
relative linguistic dominance of languages. This can best be illustrated by an 
example. Consider the position of English and Spanish in the south-west/west 
of the United States, especially in the large urban centers such as Los Angeles 
(Silva-Corvalán 1994). First, Spanish influenced the English of the Chicanos, 
then after some time, the type of English they developed had a reverse influence 
back on their Spanish because for many their English has become more dominant 
(see the discussion in Fought, this volume). The same is true for other large 
immigrant groups, especially in later generations, e.g. the Turks in Germany 
whose Turkish is now influenced by German although for the first generation of 
immigrants in the 1960s Turkish was the dominant language and this influenced 
the kind of German they spoke. 


4 Conclusion 


There can be little doubt that the value of contact explanations in linguistics has 
increased with the greater level of differentiation and nuance provided in the 
many analyses presented in the last 20 years or so (Thomason, this volume). 
It has ensured that, where linguists investigate language change and the rise of 
new features, the option of contact is viewed seriously. There should be no a 
priori preference for contact accounts nor should there be for purely language- 
internal explanations either, and indeed the combination of external and internal 
developments should also be considered. 

The amount of contact-induced change can vary in the development of a lan- 
guage. The period of contact, its intensity and duration, and the social setting 
are all factors which need to be weighed up carefully. Furthermore, languages 
naturally continue to develop after the dust of contact has settled. Some contact 
features establish themselves, especially in varieties which derive from earlier 
language shift, but others recede if not accepted by the speech community in 
later generations. A balanced consideration of all these factors is essential in 
determining the effects of contact on a speech community and, in the long term, 
the language change which it results in. 


NOTES 


1 Indeed there is now a dedicated electronic journal for language contact studies 
(accessible at www.jlc-journal.org). 

2 On contact and the Slavic languages, see also Grenoble (this volume). 

3 See Zuckermann (2003) for another lexical investigation of contact, this time in the 
context of modern Hebrew. 

4 This work can involve using evidence from archaeology as well (Fortescue 1998). 

5 This approach had been anticipated to a degree by Bisang (1998). 

6 Some scholars did consider contact with Celtic along with other possibilities 
when examining particular instances of change in the history of English, see Ellegård 
(1953). 

7 New databases have been explored to offer new vistas on language contact, e.g. 
Ansaldo (2009) which looks at language contact and change in a South Asian context. 

8 On matters concerning the collection of such data, see Bowern (this volume). 

9 In the following sections the majority of examples are taken from languages in 
Ireland. The history of English and Irish is characterised by permanent contact and 
mutual influence so that many examples of different phenomena are attested (Hickey 
1995b). Given that I know this material best, I have chosen to use it for illustrations. 
However, similar situations and influences can be found in other scenarios of contact 
for other languages. 

10 Compare the similar use in modern Greek of the word filipineza (female immigrant 
from the Philippines) for ‘maid, domestic servant’. 

11 Proto-Semitic is generally regarded as having VSO (Hetzron 1987: 662). How this arose 
is unaccounted for. 

12 If fossilization sets in, this situation can in fact become permanent. 

13 There are various terms for this phenomenon. A common one is to refer to it as the 
internal possessor construction. 

14 There are only one or two of the old masculines left, e.g. der Friede ‘peace’, der Same 
‘seed’. 


SARAH THOMASON 


Language contact has been invoked with increasing frequency over the past two 
or three decades as a, or the, cause of a wide range of linguistic changes. 
Historical linguists have (of course) mainly addressed these changes from a 
diachronic perspective — that is, analyzing ways in which language contact has 
influenced lexical and/or structural developments over time. But sociolinguists, 
and many or most of the scholars who would characterize their specialty as con- 
tact linguistics, have focused on processes involving contact in analyzing synchronic 
variation. A few scholars have even argued that contact is the sole source of 
language variation and change; this extreme position is a neat counterpoint to an 
older position in historical linguistics, namely, that language contact is responsible 
only for lexical changes and quite minor structural changes. In this chapter I will 
argue that neither extreme position is viable. This argument will be developed 
through a survey of general types of contact explanations, especially explanations 
for changes over time, juxtaposed with a comparative survey of major causal 
factors in internally motivated language change. My goal is to show that both 
internal and external motivations are needed in any full account of language his- 
tory and, by implication, of synchronic variation. Progress in contact linguistics 
depends, in my opinion, on recognizing the complexity of change processes — on 
resisting the urge to offer a single simple explanation for all types of structural change. 

The structure of the chapter is as follows. Section 1 provides some background 
concepts and definitions, and sections 2 and 3 compare and contrast contact expla- 
nations with internal explanations of change. Section 4 is a brief conclusion that 
includes a warning about the need to be cautious in making claims about the causes 
of change — both because in most cases no cause can be firmly established and because 
of the real possibility that multiple causes are responsible for a particular change. 


1 Some Background Concepts 


First, what counts as language contact? The mere juxtaposition of two speakers 
of different languages, or two texts in different languages, is too trivial to count: 
unless the speakers or the texts interact in some way, there can be no transfer of 
linguistic features in either direction. Only when there is some interaction does 
the possibility of a contact explanation for synchronic variation or diachronic change 
arise. Throughout human history, most language contacts have been face to face, 
and most often the people involved have a nontrivial degree of fluency in both 
languages. There are other possibilities, especially in the modern world with novel 
means of worldwide travel and mass communication: many contacts now occur 
through written language only. 

Second, what isn’t language contact? That is, under what circumstances is an 
internal explanation for variation and/or change the only possible explanation? 
This question is trickier than it might seem to be, because it can rarely or never 
be answered by ascertaining that there was no language contact when a variant 
entered a language. For one thing, it is difficult, and maybe impossible, to find a 
language anywhere that isn’t in contact with one or more other languages at any 
given time. For another, in a sense all linguistic variation and every linguistic change 
must necessarily involve language contact. This is true because there are always 
two steps in the establishment of an innovation in a speech community: the 
initiation of variation and change must begin with an innovation in one or more 
speakers’ speech, and the spread of that variant through a speech community is 
always a matter of transfer from speaker to speaker — i.e. via language, or at least 
dialect, contact. An innovation that remains confined to a single speaker cannot 
affect the language as a whole. In spite of this complication, it has long been tra- 
ditional to posit contact explanations for linguistic variation and change only when 
two or more different languages are concerned. 

Third, what is contact-induced change? Contact is a source of linguistic change 
if it is less likely that a particular change would have happened outside a specific 
contact situation. (Note that this is a definition, not a criterion for establishing 
contact-induced change: there is no way to measure “less likely” precisely for any 
past linguistic change.) The definition has several parts, all of them important. It 
specifies “a source” rather than “the source” because a growing body of evidence 
suggests that multiple causation — often a combination of an external and one or 
more internal causes — is responsible for a sizable number of changes. It specifies 
“less likely” rather than “impossible” because historical linguists are wary, with 
good reason, about making strong claims about impossible changes. And it 
specifies “a specific contact situation” because efforts to argue for contact-induced 
change without identifying a contact situation in which it could have occurred 
are doomed to failure. One encounters occasional arguments to the effect that 
(for instance) change x must have been due to some kind of substrate influence 
because it could not have happened through strictly internal means. This gener- 
ally means that the linguist making the claim can’t find an internal route to 
innovation; but appealing to an unidentified and unidentifiable substrate language 
cannot possibly explain anything — it just adds an extra layer of mystery to the 
historical puzzle. 

The motivation for appealing to a mystery language and inferring part of its 
structure from the structure of the putative receiving language appears to lie in 
a belief that every linguistic change can be explained. This brings me to a fourth 
background point: in spite of dramatic progress toward explaining linguistic changes 
made in recent decades by historical linguists, variationists, and experimental 
linguists, it remains true that we have no adequate explanation for the vast major- 
ity of all linguistic changes that have been discovered. Worse, it may reasonably 
be said that we have no full explanation for any linguistic change, or for the 
emergence and spread of any linguistic variant. The reason is that, although it is 
often easy to find a motivation for an innovation, the combinations of social and 
linguistic factors that favor the success of one innovation and the failure of 
another are so complex that we can never (in my opinion) hope to achieve 
deterministic predictions in this area. Tendencies, yes; probabilities, yes; but we 
still won’t know why an innovation that becomes part of one language fails to 
establish itself in another language (or dialect) under apparently parallel cir- 
cumstances. There is an element of chance in many or most changes; and there 
is an element of more or less conscious choice in many changes. Even if we could 
pin down macro- and micro-social features and detailed linguistic features to a 
precise account (which we cannot), accident and the possibility of deliberate change 
would derail our efforts to make strong predictions. This assessment should not 
be taken as a defeatist stance, or discourage efforts to find causes of change: recent 
advances in establishing causes of changes after they have occurred and in track- 
ing the spread of innovations through a speech community have produced 
notable successes. More successes will surely follow; seeking causes of change is 
a lively and fruitful area of research. But the realistic goal is a deeper understanding 
of processes of change, not an ultimate means of predicting change. 

The point about not being able to explain most changes will perhaps become 
clearer if we consider the fate of most innovations. Speakers who are tired, or 
drunk, or nonnative, or three years old, or even just verbally inept frequently 
produce innovative forms. To take one type of example, collecting speech errors 
from weather reports read by radio broadcasters turns up examples like dreezle 
(in the sentence We could have some freezing dreezle) and frog (in the noun phrase 
patchy frog, a combination of frost and fog). Neither dreezle nor frog in this sense is 
likely to enter the language as a permanent feature (although smog, a combina- 
tion of smoke and fog exactly parallel to frog, has certainly done so); but the forms 
were uttered by native speakers, and that makes them linguistically possible changes 
in the language. Their subsequent history — whether they are ever used again by 
the same speakers and, if they are, whether they are adopted by other speakers 
— is a matter of linguistic and social probability, not possibility. It is easy to think 
of linguistic reasons why (for instance) frog might never turn up in an English 
dictionary (its homophony with the semantically unconnected pre-existing word 
frog would probably hurt its chances); but offering such an explanation for its 
ephemerality is a “Just So” story, mere speculation. There is no evidence to sup- 
port it. And the same is true of most or all lexical and structural innovations that 
don’t achieve ultimate success. 

Fifth, how can internal and external causes of change be established? 
Traditional ways of identifying internal causes of change have emphasized 
learnability, in the form of pattern pressures, or structural imbalances, that make 
a particular set of forms or syntactic structures relatively hard to learn. In phono- 
logical systems, especially in the past 30 years or so, phonetic causes of change 
have been identified. There is now experimental research supporting certain 
claims of ease of learning, but to date these studies cover only a very small fraction 
of past and ongoing changes. In another line of research, over the past 40 years 
variationists have studied the spread of variants within speech communities, 
identifying (for instance) social networks that contribute to the success of a given 
innovation. 

Establishing external causes of change has proved especially difficult, in 
part because most historical linguists used to be reluctant to admit any contact 
explanation for a structural change. Even now, there is a strong tendency to 
consider the possibility of external causation for a change only when the search 
for an internal cause has failed to produce a plausible result. This tendency has 
weakened in the past 20 years or so, but it has not vanished. It is therefore 
necessary to be quite explicit, and rigorous, about criteria for establishing 
contact-induced change. 

Here is a list of conditions that must be met (see Thomason 2001: 93-4 for 
further discussion and justification of these requirements). These conditions 
are usually needed only to support claims that structural features have been 
transferred without the morphemes that express them in the source language; 
given a reasonable amount of luck, loanwords will declare their origin, in which 
case no further effort is needed. Attaching an external cause to an innovative 
structural feature is much more difficult. The first requisite is to consider the pro- 
posed receiving language (let’s call it B) as a whole, not a single piece at a time: 
the chances that just one structural feature traveled from one language to another 
are vanishingly small. Second, identify a source language (call it A). This means 
identifying a language — or, if all speakers of A shifted to B, one or more closely 
related languages — that is, or was, in sufficiently intimate contact with B to 
permit the transfer of structural features. Third, find some shared features in A 
and B. They need not be identical in the two languages, and very often they won’t 
be, because transferred features often don’t match in the source and receiving 
languages. They should, however, belong to a range of linguistic subsystems, e.g. 
both phonology and syntax, so as to rule out the possibility of structurally linked 
internal innovations. Fourth, prove that the features are old in A — that is, prove 
that the features are not innovations in A. And fifth, prove that the features are 
innovations in B, that is, that they did not exist in B before B came into close 
contact with A. For the sake of completeness, the search for causes should not 
end here, even if an external cause has been solidly established, because the influence 
of internal causal factors must also be considered. The best explanation for 
any linguistic change will take all discoverable causal factors into account, both 
internal and external. The rather extensive literature that attempts to decide 
between an internal and an external cause of a particular change is a waste of 
effort — the dichotomy is false, and the best historical explanation might well have 
to appeal to both causes. 

If all five of the prerequisites can be met, then the case for contact-induced change 
is solid. But if one or more cannot be met, then any claim of external causation 
must be tentative at best. The most common weakness in a case for external cau- 
sation concerns requisites four and/or five. Here are two examples to illustrate 
an all too common situation. In the well-known Pacific Northwest Sprachbund 
(Oregon, Washington, British Columbia), a number of striking structural features 
are shared widely in all languages of the region. These include, among other things, 
typologically marked features such as velar versus uvular obstruents, labialized 
dorsal consonants, lateral obstruents, a weak noun/verb distinction, verb-initial 
sentence structure, optionality of plural inflection, and numeral classifiers. All 
specialists agree that this area is a true Sprachbund. But it turns out that all the 
most widespread areal features must be reconstructed for all three of the major 
language families in the region — Proto-Salishan, Proto-Wakashan, and Proto- 
Chimakuan. This means that we have no actual evidence for diffusion of any 
of the features. They could, in principle, either be inheritances from a common 
ancestor for the three families or due to diffusion among the three protolanguages. 
But just as we have no evidence of diffusion, because we can’t satisfy the fourth 
and fifth conditions for establishing contact-induced change, we also have no 
evidence that two or all three of the language families are changed later forms 
of a single parent language. (Such a relationship has been proposed, most pro- 
minently in recent times in Greenberg 1987, but Greenberg’s methods are con- 
sidered by virtually all specialists to be fatally flawed, and his results have not 
been accepted by historical linguists who specialize in Native American languages.) 
At least we can rule out accident as the source of the shared area-wide features, 
because the package is too specific and too unusual as a package to make acci- 
dental similarity an attractive hypothesis. But this does not help us to choose 
between the remaining two possibilities. 

This last point suggests that there may be no plausible historical explanation 
at all for a particular change or set of changes. This inference is correct. Our goal 
in analyzing linguistic changes, always, is to arrive at the best available his- 
torical explanation for a change. But in many cases, as noted above, no historical 
explanation is available; and in others, like the question of the origin of the shared 
area-wide features in the Pacific Northwest, two explanations are in principle 
available, but there is no supporting evidence for either of them. 


2 Contact Explanations and Internal Explanations of 
Change: Social Predictors 


Not surprisingly, contact-induced change can be viewed from a variety of per- 
spectives. The focus in this section and in section 3 is on factors that predict 
the outcome of change processes — that is, on social and linguistic explanatory 
factors. The ones presented here are discussed in more detail in Thomason 2001 
(ch. 4), though without the present emphasis on comparing and contrasting 
external and internal causation. As we will see, certain factors are unique to 
contact-induced change, while others are shared with internally motivated 
change; but there do not appear to be any factors that are unique to internally 
motivated change. I will argue (following Dorian 1993) that there is often no clear- 
cut dichotomy between internally and externally motivated change, and that this 
makes the search for causal explanations even harder than it might otherwise be. 

Let us turn now to a consideration of some major predictors of linguistic 
change — not predictors in an absolute sense of predicting that change will occur, 
but predictors in the sense of conditions under which certain kinds of change 
can take place. The major (general) social factors that are relevant for predicting 
the effects of contact-induced change are the presence versus the absence of 
imperfect learning, intensity of contact, and speakers’ attitudes. Of these, the 
first will not be relevant for internal explanations of change, because the agents 
of internally motivated change are native speakers of the changing language, 
or nonnative speakers who have native-like fluency in the changing language, or 
incipient native speakers (if children produce innovations during the process of 
first-language acquisition). This is true both of the original innovator(s) of a novel 
linguistic feature and of those who participate in the spread of the innovation 
through a speech community. 

The presence or absence of imperfect learning by a group of people is, how- 
ever, a major predictor of the outcome of contact-induced change (Thomason 
& Kaufman 1988, Thomason 2001: 66-76). Here is a brief characterization of the 
contrasting expectations. When the agents of change are fluent speakers of the 
receiving language, the first and predominant interference features are lexical items 
belonging to the nonbasic vocabulary; later, under increasingly intense contact 
conditions, structural features and basic vocabulary may also be transferred from 
one language to the other. The only major type of exception to this prediction is 
found in communities where lexical borrowing is avoided for cultural reasons; 
in such communities, structural interference may occur with little or no lexical 
transfer. The prediction for the outcome of contact situations in which one group 
of speakers shifts to another language, and fails to learn it fully, stands in sharp 
contrast to a situation in which imperfect learning plays no role: in shift-induced 
interference, the first and predominant interference features are phonological and 
syntactic; lexical interference lags behind, and in some cases few or no lexical items 
are transferred from the shifting group’s original language to their version of the 
target language. Here too there is a class of potential exceptions. If the shifting 
group is a superstrate rather than a substrate population, then there may be a 
large number of transferred lexical items. (But it is doubtful that the famous case 
of the shift by Norman French speakers to English in England ca. 1200 CE, which 
is often cited as a prime example of superstrate shift, belongs in this category: 
by the time of the shift, the Norman French speakers in England were almost 
certainly fully bilingual, so that imperfect learning in fact played no role in the 
process of shift.) 

The second social factor that affects contact-induced change, intensity of con- 
tact, is relevant to both of these general types of contact-induced change. Where 
imperfect learning plays no role, intensity of contact is typically derived from things 
like the duration of contact and the level of bilingualism in the receiving-language 
speech community. The longer the contact period and the greater the level of bilin- 
gualism, the more likely it is that structural features will be transferred along with 
lexical items. In shift-induced interference, intensity has to do with such factors 
as the relative sizes of the populations speaking the source and receiving languages, 
the degree of access to the target language by the shifting group, and the length 
of time over which the shift occurs: if there are many more shifting speakers than 
original target-language speakers, if the shifting group has only limited access to 
the target language, and if the shift occurs abruptly, a relatively large amount of 
shift-induced interference is likely. In the opposite situation, little or no shift-induced 
interference can be expected. 

A typical example of the former type is the variety of English of a community 
of Yiddish speakers in the United States. In this community, Yiddish-speaking 
immigrants learned English, but Yiddish was their first language and remained 
their main language. Their contact with native English speakers was somewhat 
limited; within their community, they were certainly the majority; and they 
learned English as a second (or third, or . . .) language rather abruptly. As a result, 
their English displayed moderate lexical interference but strong phonological and 
morphosyntactic interference from Yiddish, their first language (Rayfield 1970: 85). 
(These immigrants did not in fact shift from Yiddish to English; they remained 
Yiddish-dominant bilinguals throughout their lifetimes.) Ironically, given its 
reputation as an extreme case of (superstrate) shift-induced interference, the 
structural influence of Norman French on English illustrates the opposite set of 
conditions: between 1066 and ca. 1200, when the Normans finally shifted entirely 
to English in England, they were always vastly outnumbered by English speakers, 
and their access to English as spoken by native speakers was apparently unlimited. 
Nor was their shift to English abrupt; as noted above, by the time it occurred, the 
Norman population in England had apparently long been bilingual. Structural 
interference of French on English is quite modest, especially compared to the huge 
number of French loanwords. 

One complication here is that a significant amount of shift-induced interference 
is sometimes found in a long-term contact situation in which imperfect learning 
was important early on, but then bilingualism was established and maintained 
for a considerable period of time: the shifting group’s version of the target lan- 
guage was fixed at a time when the level of bilingualism was low among mem- 
bers of the shifting group, and (possibly for attitudinal reasons: see below) it never 
converged toward the target language as spoken by the original target-language 
speech community. An example is the French spoken in the (originally) Breton 
community of Ile de Groix, France: as of 1970, only the oldest inhabitants were 
fluent speakers of Breton, but the French spoken in the community was heavily 
influenced by Breton in both structure and vocabulary (Fowkes 1973: 195, in a 
review of Ternes 1970). 

Intensity of contact is a factor in internally motivated change if (and only if) 
we include under “contact” processes of person-to-person transmission of 
variants within a single speech community — that is, the spread of an innovation. 
The concept of social networks, as developed in e.g. Milroy 1987, is of obvious 
relevance here: the idea is that variants spread through networks, and close-knit 
social networks characterized by intense contact among the participants can 
facilitate the spread of innovations. Intensity of contact of course plays no role in 
the initial innovation of an internally motivated change; it can be relevant only 
for the spread of a change. 

The third general social predictor, speakers’ attitudes, is certainly relevant both 
in contact-induced change and in internally motivated change. This is admittedly 
a very vague notion, but it is difficult to make it precise, and equally difficult, 
unfortunately, to prove that it has affected the course of language history. 
Speakers may or may not be aware of the attitudinal factors that help to shape 
their linguistic choices, and historical linguists are (of course) unable to establish 
speakers’ attitudes except in the tiny handful of instances in which metalinguistic 
comments on innovations are found in old documents. To take just one of many 
frustrating examples, a striking areal feature of at least some parts of the Pacific 
Northwest Sprachbund is an avoidance of loanwords from French and English, 
the European languages with which the indigenous peoples first came into con- 
tact. In Montana Salish, for instance, there are hardly any English (or French) loan- 
words, in spite of massive cultural assimilation to European-derived culture; a 
similar situation obtains in Nez Perce, an unrelated language whose speakers have 
long had close contact with Salishan tribes. The two tribes have instead constructed 
descriptive words from native morphemes to name items imported from Anglo 
culture, as in the Montana Salish word for “automobile,” p’ip’iysn, literally ‘it 
has wrinkled feet’ (so named because of the appearance of tire tracks), and the 
Nez Perce word for “telephone,” cewcew’in’es, literally ‘a thing for whispering’. 
Modern elders, when asked by young tribal members how they would say (for 
example) “television set” in Salish, make up comparable words on the spot; but 
when asked why they don’t just use the English word with a Salish pronunci- 
ation, they don’t know — they merely shrug and say that’s how it is. In other 
words, the reason for this culturally determined pattern of lexical innovation is 
unknown to current tribal members, and was possibly never a conscious avoid- 
ance pattern. 

The clearest examples of speakers’ attitudes as a cause of change therefore come 
from cases of deliberate change (though even here it is often impossible to prove 
that a change was made with full conscious intent). Some of these are internally 
motivated, at least in the sense that no direct language contact was involved; 
others are externally motivated, according to the definition of contact-induced 
change given above in section 1. Language planning is one prominent source 
of internally motivated deliberate changes. For instance, the twentieth-century 
Estonian language reformer Johannes Aavik introduced hundreds of new words 
and a sizable number of morphosyntactic features into Estonian, and many of these 
became permanently fixed in the language (Saagpakk 1982, Thomason 2007). Several 
striking examples of contact-induced changes that must be explained by speakers’ 
attitudes come from situations in which a speech community wishes to distin- 
guish its language, or more likely its dialect, more sharply from its neighbors’ 
speech. Changes range from lexical substitution in certain languages of Papua New 
Guinea (Kulick 1992: 2), to reversal of gender assignments, also in Papua New 
Guinea (Laycock 1982, Thomason 2007), to phonological distortion of words in 
Lambayeque Quechua in Peru (David Weber, p.c. 1999, citing research by Dwight 
Shaver), and to a combination of the first and third of these types of change in 
Mokki, a language of Baluchistan that was created by lexical substitutions and 
distortions (Bray 1913: 139-40). Lexical substitutions are also found, of course, 
in teenage slang, which is usually (at least in the United States) a product of 
internally motivated lexical innovation. 

A consideration of social predictors of externally and internally motivated lan- 
guage change could be extended to more fine-grained social factors, but the very 
general factors discussed here provide a good overall comparative picture of exter- 
nal and internal explanatory social factors. One of the three categories, speakers’ 
attitudes, seems to be equally effective in internal and external causation. A 
second, intensity of contact, is very important to both innovation and spread of 
an innovation in contact-induced change, but it is relevant only to the spread of 
an innovation in internally motivated change. The third category, the differential 
effects of interference depending on whether or not imperfect learning played a 
role, is relevant only for contact-induced change. 


3 Contact Explanations and Internal Explanations 
of Change: Linguistic Predictors 


As with social predictors, the discussion here of linguistic predictors is confined 
to quite general categories; here too the analysis could be extended to cover more 
specific linguistic factors, but the broader categories will serve to develop an over- 
all comparative picture. For contact-induced change, the most important linguistic 
predictors are typological distance, universal markedness (with its ultimate appeal 
to ease of learning), and degree of integration within a linguistic system. The first 
of these is relevant only to contact-induced change, but the other two are equally 
important for internally and externally motivated change. 

Typological distance is a (very informal) measure of structural differences 
between two linguistic systems; it cannot be relevant for making predictions about 
internally motivated change because internal motivations arise within the same 
overall system. If different dialects are involved in motivating a change, then that 
is contact-induced change, not internally motivated change. If differences exist 
in two parts of the same system — say, in two different inflectional classes in a 
language’s system of noun declension — then analogic changes that bring about 
leveling of the two classes are indeed akin to contact-induced change if they are 
interpreted as interference between two different (sub)systems. But in such a case 
the typological distance between the two (sub)systems is minimal or zero: with 
minor variations, noun classes in a single language will share the same structural 
morphological and syntactic properties. 

In contact-induced change, the degree of typological distance between specific 
subsystems of a source language and a receiving language helps to predict the 
kinds of interference that may occur under differing degrees of contact intensity. 
Where typological distance is small, linguistic subsystems in which contact- 
induced change is in general rare may undergo contact-induced change. This 
principle is most obviously illustrated by changes in the inflectional morphology, 
which tends to lag behind phonology and syntax and even derivational morphology 
in a catalogue of interference features, in both shift-induced interference and 
interference in which imperfect learning plays no role. 

Specifically, minimal typological distance is in part responsible for the frequency 
of interdialectal interference involving inflectional features that are rarely trans- 
ferred in cases of foreign interference. An example is the realignment of syncretic 
categories in the plural oblique cases of masculine and neuter nouns in certain 
nonstandard dialects of (the language formerly known as) Serbo-Croatian. Where 
Standard Serbo-Croatian has syncretism in the Dative-Instrumental-Locative 
plural, whose suffix -ima is opposed to the Genitive plural suffix -a:, some non- 
standard dialects of the Čakavian dialect group had instead a Genitive-Locative 
suffix -i:h versus Dative plural -o:n and Instrumental plural -i. But in the major 
study of one of these Čakavian dialects, that of the Adriatic island Hvar, the ana- 
lyst found that only older speakers still retained the inherited Čakavian system; 
by contrast, younger speakers were using -i:h only for the Genitive plural, and 
they had borrowed the standard dialect’s suffix -ima for the Dative, Instrumental, 
and Locative plural cases (Hraste 1935: 17-25). It is important to emphasize here 
that the role of typological distance in this and other cases is not to motivate the 
change itself; social factors — in this instance an expanded educational system that 
exposed younger speakers heavily to the standard dialect, together with the pres- 
tige of that dialect — were surely the primary cause of the change. (Note that when 
I conducted dialect research in Yugoslavia in 1965 it was still fairly easy to find 
villagers as young as 60 who had never attended school. Ten years later this would 
almost certainly have been impossible.) The role of typological distance, in a 
case like this one, is to make possible a type of change that would be a great deal 
less likely if the source and receiving systems differed in some or all of their 
inflectional categories (gender, number, case, and/or noun class). 

Minimal typological distance sometimes does facilitate otherwise rare types of 
contact-induced change in different languages, of course. A well-known example 
is the borrowing of Bulgarian person/number verbal suffixes in Megleno- 
Rumanian, yielding double-marked verb forms: original Megleno-Rumanian 
forms like aflu ‘I find’ and afli ‘you find’ were already marked for 1sg and 2sg 
subject, respectively, but the borrowed Bulgarian 1sg and 2sg suffixes -m and -š
were nevertheless added, producing aflum and afliš. Bulgarian and
Megleno-Rumanian are only distantly related to each other, but their verbal systems have
the same person-number combinations, so that these redundant suffixes could
be added without any adjustment in the categories: there was already a
one-to-one correspondence in the expression of 1sg and 2sg in verb inflection. In other
words, the typological distance between the two languages at this structure point
was zero.
It is far more common, however, for different languages to differ significantly
in inflectional and other categories, and in these cases typological distance is likely 
to prove a barrier to contact-induced change. Not an absolute barrier: under 
circumstances of intense contact, any linguistic feature can be transferred to 
any other language. But the effect of typological distance when typologically 
dissimilar languages are in intimate contact is reflected in the various borrowing 
scales that have been proposed in the literature (e.g. in Thomason 2001: 69-71): 
with casual contact, only easy-to-borrow items can be transferred from source 
to receiving language, primarily or only nonbasic vocabulary; with increasingly 
intense contact, harder-to-borrow features may be transferred, ultimately includ- 
ing even inflectional morphology, basic vocabulary, and typologically disruptive 
features (e.g. prefixes introduced into a language that was previously exclusively 
suffixing). 

Importantly, no borrowing scale can be valid for cases of shift-induced inter- 
ference, because in shift-induced interference the primary agents of contact-induced 
change are speakers of the source language who have learned the receiving 
language imperfectly. In particular, neither introducing features from one’s 
first language into a second language nor failing to learn aspects of the second 
language (the target language) requires any great knowledge of the second lan- 
guage — by definition, in the case of learning failure. The validity of borrowing 
scales therefore rests on their appropriateness for cases of interference in which 
imperfect learning plays no role. 

Certain features that appear at the hard-to-borrow end of a borrowing scale 
are hard to borrow because they are relatively hard to learn (typically, because 
they are universally marked): you can’t borrow what you don’t know, so only 
the most fluent bilinguals can introduce hard-to-learn features into their second 
language. But part of the reason that features are hard to borrow has to do with 
the degree to which they are integrated into the linguistic system. This is why 
inflectional features are so much less likely than other structural features to be 
transferred from one language to another: inflectional systems tend to be very 
tightly integrated in a system of interlocking structural relationships, and it is 
thus likely to be difficult both to extract a single form or set of forms from one 
system and to insert foreign forms into another inflectional system, especially 
one that is typologically incompatible. (See below for more detailed discussion 
of the role of markedness and of systemic integration.) 

In fairly intense contact situations where imperfect learning plays no role and 
where the languages involved are very different typologically, then, interference 
features are most likely to include much nonbasic vocabulary, some function words 
and derivational morphology, borrowed phonetic and phonological features 
confined to loanwords, and borrowed syntactic features that do not cause major 
typological change. But toward the extreme end of the borrowing scale, where 
contact is very intense, typologically significant contact-induced changes may 
occur: borrowed basic vocabulary, borrowed phonology and phonetics in native 
vocabulary, borrowed syntactic features that do alter the receiving language’s 
syntactic typology, and even borrowed inflectional categories and patterns. The 
Iranian language Ossetic, which has borrowed extensively from neighboring 
Caucasian languages, provides an example from the middle of the scale, with
moderate structural interference. In addition to many loanwords from Georgian 
(a South Caucasian language), Ossetic has borrowed ejective stops, which appear 
even in native Ossetic vocabulary, a declensional system with more cases than 
Ossetic inherited from Proto-Indo-European and Proto-Iranian, agglutination 
(replacing the largely flexional morphology typical of Indo-European languages), 
and a more rigid SOV word order, with more postpositions, than one finds in 
Iranian languages that have not undergone interference from Georgian or other 
Caucasian languages. 

In sum, the popularity of borrowing scales, and the fact that they seem to be 
largely valid (in the absence of powerful social factors that skew the results, such 
as a culturally motivated disinclination to borrow words from certain languages), 
indicate the need for ideal social conditions — in the form of very strong cultural 
pressure from a dominant language — in order to override typological barriers to 
the transfer of hard-to-borrow structural features. It may reasonably be surmised 
that typological barriers could also be broken down in situations where attitu- 
dinal factors favor the adoption of typologically disruptive innovations. Here are 
two examples that illustrate this possibility, though both are admittedly minor in 
terms of their current and probable ultimate effects on the receiving languages’ 
systems (see Thomason 2007 for further discussion). 

First, the remaining speakers of Ma’a, a mixed language of northeastern 
Tanzania, are bilingual in Shambala and in fact form part of the Shambala speech 
community. Ma’a combines Cushitic lexicon and a few residual Cushitic struc- 
tural features with Bantu lexicon and structure, plus a sizable component of Masai 
words; its complex history involves heavy Bantuization of an originally non-Bantu 
(presumably Cushitic) language, and also a near-complete shift from Ma’a to 
Shambala, with retention only of some non-Bantu lexicon — and one phonological 
feature that is foreign to the nearby Bantu languages, a voiceless lateral fricative 
/ɬ/. Nowadays Ma’a speakers emphasize the distinctness of their heritage language
by introducing this fricative into their Bantu discourse, using it not only in Ma’a 
words but also in Bantu words (Mous 1994: 199). The effect is to make their Bantu 
speech less Bantu-like, and apparently also to impress their non-Ma’a-speaking 
Bantu interlocutors with their ability to pronounce such an exotic sound. The intro- 
duction of this fricative into Shambala is both phonetically and typologically novel. 

The second example is syntactic accommodation to English structure in two unre- 
lated Native American languages of North America, Montana Salish and Nisgha. 
Neither case involves actual language change; but in both cases the deliberate inno- 
vations attest to the possibility of future contact-induced syntactic change that would 
be typologically disruptive in a major way (although Montana Salish, at least, will 
lose its remaining fluent native speakers within the next 20 or 30 years, too short 
a period to see any major syntactic changes). Montana Salish is a polysynthetic 
language with morphological marking of up to three arguments in its verbal 
structure and many other verbal affixes that express additional inflectional and 
derivational features. Its basic sentential word order pattern is VOS. Ten or 
twelve years ago, in an elicitation session focusing on ditransitive verbs, I asked 
an elder for sentences with glosses like ‘Johnny stole huckleberries from Mary.’
Again and again my consultant responded with sentences like Čoni naq’ʷ t st’ša
tl’ Mali (lit. ‘Johnny steal PARTICLE huckleberry from Mary’), very close to a
word-for-word translation from English, instead of the expected Naq’ʷ-m-ɬ-t-s Mali ci t
st’ša Čoni (lit. ‘steal-derived.transitive-relational-transitive-he Mary that PARTICLE:
from huckleberry PARTICLE Johnny’). The former sentence, though perfectly
grammatical in Montana Salish, is bizarre in isolation or indeed in any context 
other than a very particular discourse context (usually involving a change of agent 
from a previous sentence). But it is very “Englishy.” The latter sentence, with all 
the elaborate morphological apparatus of a ditransitive verb and with the usual 
marking of full-noun arguments, is what would be expected in isolation and 
in almost all ordinary discourse contexts. When I finally asked the elder if he 
didn’t think the translations he was giving me were rather, um, Englishy, he was 
surprised: Yes, of course they are, he said; but you asked in English, so I thought 
that’s what you wanted. 

The Nisgha example is partly parallel. In working with speakers of Nisgha, in 
which objects are deleted under identity with the object of a previous clause 
(as in pseudo-English They heard him but couldn’t see) except in emphatic contexts,
Tarpent found that Nisgha-English bilinguals tended to insert object pronouns
into their Nisgha speech when they were trying “to approximate the English
utterance” (Tarpent 1987: 157-8). She also observed that “some bilingual speakers ...
tend to stick very close to English surface structure” when they translate English 
sentences, resulting in very strange-sounding Nisgha. 

Let us turn now to the other two major linguistic predictors of contact-induced 
change, the two that are also relevant for internal explanations of language 
change: universal markedness and degree of integration within a linguistic 
system. Markedness, as noted above, is ultimately connected with ease of learn- 
ing. Universally marked features are believed to be those that are harder to learn; 
unmarked features should be easier to learn. The amount of evidence available 
to support this connection is unfortunately limited, but two main lines of evidence,
frequency of cross-linguistic occurrence (expected to be greater for unmarked
than for marked features) and age at which children learn them in acquiring their
first language (earlier for unmarked features, later for marked features), do
converge on certain features, especially phonological ones. To take a rather trivial
example, a phoneme /t/ is close to universal in the languages of the world, while
a phoneme /θ/ is quite rare; and /t/ is acquired earlier than /θ/ by children
learning English as a first language. So, arguably, /t/ is less marked than /θ/.

Markedness plays an important role in shift-induced interference, both because 
the shifting group is more likely to learn unmarked target-language features than 
marked ones and because, if the two speaker groups eventually merge into a 
single speech community, original target-language speakers are less likely to adopt 
marked features than unmarked ones from the learners’ version of the target 
language. But in contact-induced changes that do not involve imperfect learning, 
markedness is likely to be of considerably less importance, and may not play any 
role at all. The reason is that the agents of change in this type of interference are 
fluent speakers of the receiving language and of (at least the relevant aspects of)
the source language as well, so ease of learning does not enter the picture; if they 
borrow structural features, they are as likely to incorporate marked as unmarked 
features. The Ma’a example above is a case in point: the reason for introducing 
the lateral fricative into Ma’a speakers’ Bantu speech was precisely its exoticness, 
its status as a hard-to-pronounce phoneme. 

In internally motivated change, ease of learning is the major component of the 
phenomenon known as drift, the only internal factor traditionally recognized as 
a cause of language change. (The other two traditional causal factors are exter- 
nal: dialect borrowing and foreign interference. These external factors are of course 
sometimes difficult or impossible to distinguish from one another.) The idea behind 
the concept of drift (which was first proposed, though with now controversial 
trappings, in Sapir 1921) is that pattern pressures, or structural imbalances, 
motivate language change because they are structure points that cause learning 
problems. Irregularities are harder to learn than regular patterns, for instance; and 
marked features, which by definition impose a burden on learning, are relatively 
hard to learn and are therefore likely to be diachronically unstable. A well-known 
characteristic of drift is that it often brings about similar or identical changes in 
related languages, especially during the period immediately following the split 
of two or more sister languages from their common parent — because all the 
daughter languages will have inherited the same pattern pressures from the 
parent, and efforts to ease the learning burden may well take the same route in 
each. In the Indo-European language family, for instance, the history of all extant 
branches displays a tendency toward inflectional simplification (with resulting 
complication in the syntax); in noun declension, for example, even those languages 
that best retain many aspects of Proto-Indo-European noun declension have 
collapsed several inherited noun classes, which were semantically opaque, into 
just three, with a few residual forms from other classes. 

It is hardly surprising, given that markedness plays an important role both 
in shift-induced interference and in internally motivated structural change, that 
different (unrelated) languages can undergo the same change for quite different 
reasons. It therefore makes no sense to ask (for instance) whether the loss of 
certain plural case distinctions in the Serbo-Croatian dialect of Hvar is due to 
contact-induced change or to the same long-term process of drift that has eroded 
Indo-European inflectional distinctions over many centuries. Of course in that 
case there is no doubt about the contact influence — the borrowed Dative- 
Instrumental-Locative plural suffix -ima attests to its standard-dialect origin — but 
drift may well also have played a role in motivating the change. And if (as is 
often the case) it had been merely the standard-language Dative = Instrumental 
= Locative plural pattern that was involved, and not the actual suffix, it would 
still make no sense to assume that a choice between an internal cause and an 
external cause was necessary. 

The third major linguistic causal factor, degree of integration into the system, 
also has its place in explaining both internally and externally motivated changes, 
although here internal causation is somewhat less clear than external causation. 
Contact-induced changes are much more common in loosely integrated linguistic 
subsystems than in subsystems which, like the inflectional morphology, are
characterized by sets of interconnected forms organized into paradigms. In fact, 
as noted above, inflectional morphology is the linguistic subsystem that is least 
likely to be affected by contact-induced change, owing in part to typological dis- 
tance (precise category matching in the inflectional systems of different languages, 
except with close relatives or in certain Sprachbund situations, is unlikely) and 
in part to the fact that the inflectional morphology typically displays the highest 
degree of integration of elements, of interlocking structure. Transfer of inflectional 
features from one language to another is therefore likely to happen only under 
very intense contact conditions, including processes of language shift as well as 
changes in which fluent bilinguals are the agents. And most such changes fit into 
the receiving language’s inflectional patterns without significant typological dis- 
ruption. So, for instance, Lithuanian acquired two new noun cases when speakers 
of Finnic languages, which are rich in case distinctions, shifted to Lithuanian; 
but the cases fit naturally into the already case-rich Lithuanian system of noun 
declension and caused no typological change. As we saw above, the same is true 
of the Bulgarian person/number suffixes added to Megleno-Rumanian verbs. In 
general, however, less close-knit subsystems more readily admit new elements; 
that is why nonbasic vocabulary is the predominant category of interference 
features in contact situations where imperfect learning plays no role. 

It is far from obvious that tightly integrated inflectional systems tend to resist 
internally motivated change, or that they change more slowly than other linguistic 
subsystems (although claims of super-stable morphology have certainly been made; 
see Thomason 1980 and sources cited there for discussion). Analogic changes within 
inflectional paradigms (as well as in other subsystems) are routine occurrences 
in languages all over the world. One might argue, however, that the analogic 
processes themselves arise out of tightly integrated systems, where partial regu- 
larities can be found on multiple axes and thus motivate analogic changes. In this 
sense, the degree of integration can be said to serve as a predictor of internally 
motivated change. 

In sum, like the effects of social predictors of change, the effects of linguistic 
causal factors are unevenly distributed between externally and internally motiv- 
ated change. Of all the predictors we have surveyed, two — presence or absence 
of imperfect learning and typological distance — are restricted to contact expla- 
nations. Two others, intensity of contact and degree of integration within a system, 
are highly relevant in explaining contact-induced change but (apparently) of less 
relevance in explanations of internally motivated change. And the remaining two 
factors, speakers’ attitudes and markedness, are significant in both types of expla- 
nation, although in contact-induced change the effects of markedness are largely 
restricted to shift-induced interference, where imperfect learning is important. 


4 Conclusion 


What, then, is the value of contact explanations in linguistics? This very general 
question has not been fully answered in this chapter, but by surveying causal 
factors in both externally and internally motivated change, I hope to have
sketched the beginning of an answer. In analyzing the linguistic effects of lan- 
guage contact, the linguist needs to consider all possible causal factors. If any 
factors can be ruled out, the task of explaining the changes should become 
easier. Only some predictors discussed above have relevance for internally motiv- 
ated change, so the range of internal explanations for a given set of changes in 
an intense contact situation is narrower to begin with than the range of potential 
contact explanations. The search for internal and external causation should pro- 
ceed in parallel, because of the strong possibility, an actuality in many cases, that 
both internal and external causes influenced the linguistic outcome of change. 

We have also seen that certain social factors — most obviously the presence or 
absence of imperfect learning and intensity of contact — set the stage for different 
linguistic outcomes. For example, if contact is intense enough, especially if no imper- 
fect learning is involved, then typological distance is no barrier to extensive 
structural borrowing; to take another example, speakers’ attitudes can trump expec- 
tations for types and degree of both externally and internally motivated change. 
In other words, in this domain social factors rule. This of course does not mean 
that linguistic predictors are necessarily less important or less significant in a given 
case than social predictors. It only means that, in cases where linguistic and social 
factors point to different outcomes, the social factors will be more effective. 

A caveat is needed at this point: although it is easy to find clear examples of 
causation, with all the causal factors discussed here, it remains true that for the 
vast majority of known linguistic changes there is no adequate explanation. Some 
of the reasons are obvious: often, in past contact situations, too little is known 
of the social and linguistic circumstances to satisfy the requisites for establishing 
contact-induced change, and the same is unfortunately true for internal causation. 
This should not discourage the search for causes. The search itself is illuminat- 
ing, and concentrating on changes for which we can build a solid explanatory 
account helps to advance our understanding of the probabilities of change. 

This leads to a final point. Historical linguists and sociolinguists employ such 
different methods that they sometimes misunderstand each other’s results; and, 
more ominously, specialists in each subfield are sometimes inclined to dismiss 
the results achieved in the other subfield. This inclination, if indulged, diminishes 
our chances of arriving at a single unified account of contact-induced change, and 
of contact explanations more generally. There can surely be no doubt that a satis- 
factory theoretical understanding of contact phenomena must be compatible with 
the results of both subfields, and must, for that reason, help each set of specialists
to a better understanding of the processes they study. 

2 Genetic Classification and
Language Contact 


MICHAEL NOONAN


1 Introduction 


Until recently, within orthodox linguistic circles, there probably would have 
been little to say about the relation between genetic classification of languages 
and language contact except that the latter was irrelevant to the former. 
Languages might indeed come into contact and various aspects of grammar 
might be borrowed from one language to another, but such borrowing did not 
affect a language’s genetic classification, which was determined by the retention 
of inherited morphemes through the process of regular generational transmission, 
and which was scientifically established by comparing only inherited, not bor- 
rowed, morphemes through the comparative method. 

Over the years, a number of linguists have expressed reservations about many 
of the assumptions underlying this view, and in this paper I will examine both 
the orthodox view and various alternatives to it. The paper will begin with a dis- 
cussion of what it might mean to say that two languages are genetically related. 
I will follow this with a discussion of models of genetic relatedness, paying 
special attention to the widely accepted family tree model and the assumptions 
that underlie it. I will then consider various outcomes of language contact and 
discuss what sorts of models of genetic relatedness these are most compatible with. 
Lastly, I will address the topic of speciation — the creation of new languages — 
and language contact. 


2 What Do We Mean by the Genetic Classification 
of Languages? 


To the layperson, it is perhaps not immediately obvious what could be meant 
by the “genetic” classification of languages. Languages are not living organisms 
for which descent, implying as it does birth, parenting, and death, could unprob- 
lematically apply. A layperson’s guess might be that, in speaking about the 
genetics of languages, linguists are equating the genetics of language with the
genetics of the people who speak them. 

From the standpoint of contemporary linguistics, this is surely not the case. 
Regardless of what one thinks about the various theories which have been put 
forward asserting a strong genetic influence on the structuring of languages, all 
mainstream linguists would agree that any child can learn any language natively, 
and that the child’s genetic background has no effect on the genetic classification 
of the language the child learns. It is also surely the case that mainstream
linguists would agree that languages are in important ways cultural artifacts, con- 
sisting in some sense of cultural memes (Dawkins 1976), and that these linguistic 
memes are subject to the same sorts of pressures and changes that other memic 
systems are subject to. 

So, languages are not linked genetically to the people who speak them, and 
they are asserted to be in some sense cultural artifacts: a specific language is 
learned by a child because of the circumstances of his or her birth and rearing, 
and languages, like other aspects of culture, change through time. When speak- 
ing of other cultural artifacts, we can, of course, talk informally about “descent,” 
in the sense that one can say that in some ways American culture descends 
from British culture. But it is certainly not standard practice in such discussions 
to draw family trees and speak about genetic relatedness. Why is language 
different? 

One reason language is different is that, unlike something as amorphous as 
a culture, languages have traditionally been viewed as consisting of a finite set 
of relatively easily identifiable entities (words, grammatical affixes, rules, etc.)
which are organized systematically and which can in principle be compared in 
a straightforward manner. Such comparisons can form the basis for assessing 
relatedness among languages. 

Linguistic constructs, of course, can be related in a number of ways. The 
field of linguistic typology, for example, is concerned with an assessment and 
evaluation of the similarities and differences of various linguistic features or 
combinations of features across languages. But typological classifications are 
not genetic classifications. The traditional genetic classification of a language, at
least at the higher taxonomic levels, tells us very little about the structure of the 
language — less, for example, than knowing where in the world a language is 
spoken. Traditional genetic classification of a language also tells us very little about
the source of morphemes employed by speakers: the morpheme inventories of 
many languages can easily be shown to contain a majority of forms borrowed 
from languages outside their immediate taxonomic units. Chantyal, as just one 
example, is classified as a Sino-Tibetan language, yet its morpheme inventory over- 
whelmingly consists of borrowings from Indo-European Nepali (Noonan 2003); 
English is classed as a Germanic language, though its morpheme inventory is largely 
drawn from Italic and Greek. 

If the genetics of speakers, typological similarity, and a percentage assessment 
of the source of morphemes in a language are not relevant for telling us what 
genetic relationships among languages mean, then what is? Three approaches to 
this issue can be found in the literature, though they are usually merely assumed
rather than explicitly argued for. 

The first approach I will label the generational transmission approach. This is 
conceptually the simplest and is almost never discussed explicitly but is merely 
assumed. In this way of looking at things, assessing the genetic relatedness of 
languages amounts to assessing the history of the generational transmission 
of linguistic traditions. By “generational transmission of linguistic traditions” I 
mean the acquisition by children of essentially the same linguistic system that their 
parents acquired as children. In this way, English is a West Germanic language
because if one traces the history of the generational tradition of the language we 
now call English, we will find that it merges with the linguistic traditions we call 
Dutch, German, etc. approximately 1,400 years ago. Similarly, Irish and Hindi, 
despite their radically different typologies, can be shown through various sorts 
of evidence to be traceable back through generational transmission to a common 
language called Proto-Indo-European. 

The second of these approaches I will label the essentialist approach, following 
Croft (2000: 197). This position maintains that there are certain linguistic features, 
consisting both of grammatical morphemes and characteristic morphosyntactic 
features, that must be transmitted along a genetic line for a language to be con- 
sidered a member of a given taxonomic unit. This is not to say that these features 
over time cannot change. It maintains only that in assessing potential
mother-daughter relationships, these features must be transmitted; language relatedness
is assessed along chains of transmission of these features from mother language 
to daughter language. Croft attributes this position to Thomason and Kaufman 
(1988), who assume it as part of their discussion of normal versus abnormal trans- 
mission of language, but something similar seems to have been accepted by other 
scholars over a long period as we will see in the discussion of the treatment of 
creoles and other language varieties in genetic linguistics. 

The first two approaches will ordinarily yield the same analyses, but they differ 
conceptually, and this conceptual difference has consequences in certain cases, 
as will be discussed below in section 4. Both approaches are fully compatible
with the comparative method, a technique for verifying genetic relations that was 
developed in the nineteenth century and reached its mature form under the 
Neogrammarians. While both approaches are compatible with the comparative 
method, the essentialist approach incorporates some of its assumptions into 
the approach itself. It does this by asserting that some of the material utilized by 
the comparative method to demonstrate linguistic relatedness is not required to 
have a parent-offspring genetic relationship at all.

The third approach, which I will label the hybrid approach, is really a conflation 
of several distinct approaches, ranging from the wave theorists (e.g. Schmidt 
1872) to contemporary comparativists, e.g. the Sino-Tibetanists Benedict (1972), 
Chappell (2001), Matisoff (2001), Pulleyblank (1998), to contemporary theoreticians 
(Croft 2000; Laks 2002), and to creolists (Holm 1988, Mufwene 2002). What these 
approaches have in common is the idea that, at least in some circumstances, 
languages may be mixed, hybrids of otherwise valid taxonomic units. This is
different from the generational transmission approach and the essentialist
approach, both of which disallow hybridity in the assessment of genetic relations. 
The hybrid approach is not fully compatible with the comparative method, which
requires descent along a single genetic line for its operation. Many of those
who have subscribed to hybrid approaches nonetheless use the comparative method,
while asserting that it is not applicable, or not straightforwardly applicable,
in all cases.

A hybrid approach takes the position that a language is a collection of entities 
(morphemes, grammatical constructs, etc.) that may have multiple sources. At some 
point, the mixture of forms may become so great as to preclude the assignment 
of the language to a specific taxon within a hierarchy of taxonomic levels, though 
it might still easily be placed within a higher level. Most linguists these days would
concede that true “mixed languages” exist, e.g. Copper Island Aleut, Michif, Media 
Lengua, etc., but would relegate them to a category outside the normal develop- 
ment of languages — that is, outside any genetic line. Others would include 
creoles in the category of hybrid languages, while still others would include in 
this category at least some non-creoles as well. 

In the discussion that follows, I will have rather less to say about hybrid 
approaches than the generational transmission and essentialist approaches simply 
because the latter two have been accepted by many more linguists over the years 
and because none of the hybrid approaches has yet attracted a substantial and 
influential number of adherents. 

Before proceeding further, it might be worth asking what genetic classification 
is good for. It has already been stated that genetic classification is not always use- 
ful in providing information about the structure of a language or its morpheme 
inventory, the more so the higher up the taxonomic ladder one goes. Information 
about where in the world a language is spoken provides more useful information 
about grammatical structure, but we don’t have classifications of languages by 
region that are comparable to genetic classifications. On the positive side, how- 
ever, genetic classification has proven a boon to historical linguistics, providing 
the superstructure around which theories of language change have developed over 
the last two centuries. Such classifications also, potentially, provide information 
of considerable historical value. Typologists use genetic classifications to explain 
similarities among languages and as a consideration in constructing cross-linguistic 
samples. And, of course, most of us find satisfying the classification of familiar 
things: typically the first thing a linguist will ask on being told of an unfamiliar 
language is: “What family does it belong to?” 


3 Models of Language Families in Genetic Linguistics


In the last section, I discussed three approaches to the question of what genetic
relatedness for languages might mean. These approaches are primarily concerned
with language creation and not directly with how more remote relations among
languages might be dealt with. It remains now to discuss models of more remote
relations.

The term “genetic” strongly suggests a biological model or analog for the class- 
ification of languages, and indeed linguists have employed biologically inspired 
models. It is worth noting at the outset of this section that an explanation for why
a biological analogy employing the term “genetic” should be applied to languages
alone among cultural artifacts is virtually absent in the linguistics literature, where
by long tradition going back over two hundred years most linguists have simply
assumed the validity of a biological analog in linguistic classification.

Within a biologically inspired framework, there are at least two possible 
classes of interpretations of genetic relatedness. One could conceive of languages 
as unitary organisms and consider relatedness in a way analogous to that of indi- 
vidual animals or plants, which can be related via lineages created through sexual 
or asexual reproduction. Alternatively, one could conceive of a language as a 
population, either of speakers or of linguistic constructs, or even of a population
of speakers each with his/her idiolect and hence his/her own set of linguistic
constructs. Population models of this sort might adopt a species analogy for
understanding genetic relatedness.

In linguistics, the unitary organism model was the one adopted by historical 
linguists in the early nineteenth century; this model has survived as the received 
mode of understanding genetic relations to the present day. Within this model, 
two languages are said to be genetically related if they descend from a common 
ancestor. Since it is at least possible that all languages descend from a common 
ancestor, languages are usually claimed to be related only if their relatedness can 
be established through the comparative method or some alternative procedure.

In principle, a unitary organism model could adopt either an asexual 
(parthenogenetic) or a sexual model for conceptualizing genetic relatedness. 
The established model, known as the family tree or Stammbaum model, adopted 
parthenogenetic (asexual) reproduction as the mode for understanding genetic 
relationships among languages. The expressions mother/ancestor language and 
daughter language are components of the model and reflect the original analogy, 
as do the notions of language birth and language death. 

Of the approaches to the nature of genetic classification discussed in section 2, 
the generational transmission approach and the essentialist approach are fully 
compatible with the family tree model and for the most part seem to presuppose 
it, though in a few special cases they can allow for developments that are dis- 
allowed by the family tree model. Hybrid approaches are compatible with all the 
alternative models, namely a unitary organism model that supposes (or allows) 
sexual reproduction, and the various sorts of population models as a few scholars 
(e.g. Croft 2000, Mufwene 2001, 2007) have made explicit. 


3.1 The family tree model 


Figure 2.1 The family tree model


The family tree model assumes that any set of related languages descends from
a single ancestor according to the parthenogenetic model of biology. Within this
model, lineages may be represented by means of diagrams like Figure 2.1, where
A represents the immediate ancestor of B, C, and D and the common ancestor
of E, F, G, H, I, and J, all of which can be said to be related. The classical family
tree model assumes, following this parthenogenetic analogy, that there can be 
no special genetic relationship between, say, F and G other than their common 
descent from A. The model also assumes that influence on a language, even 
massive influence, cannot affect its genetic status, any more than external 
influence on a bacterium could, in older biological models, affect its status within 
its lineage. There is no linguistic feature or set of features which determine the 
genetic status of a language; rather, it is the circumstance of its birth that deter- 
mines this. In this way, anything that is borrowed from another language does 
not affect its genetic status. Further, the model supposes that splits (the birth of 
new languages) are always final and produce independent linguistic systems. 

This last point, that splits produce independent linguistic systems, is an import- 
ant component of the model and conforms to the parthenogenesis analogy. 
Within this model, a language is treated as an entity analogous to, for example, 
a bacterium in a line of parthenogenetic descent. It is divisible in the sense that 
it may, asexually, give “birth” to new languages, but it cannot “merge” with another 
language, it cannot engage in sexual reproduction (there is always a single, 
unique ancestor for any lineage), and it is not composed of “parts” that may merge 
or split in ways not consistent with the model generally. This last proviso con- 
cerns the status of dialects: their status within the model is exactly like that of 
languages. The model in Figure 2.1 could diagram the relations of dialects within 
a language as well as a set of related languages. 

The conceptual simplicity of the family tree classification schema follows, in 
part, from a set of basic assumptions which, taken together, make it possible, even 
necessary, to reject completely the effects of language contact in assessing genetic 
relationships. Indeed, one of the problems one often encounters in establishing 
genetic relationships according to the family tree model is the problem of stripping 
away the effects of contact so as to reveal the core of “native” material necessary 
for the comparative method. Contact is thus irrelevant for the determination of 
genetic relationships with the comparative method, though the effects of contact 
can prove an obstacle to its implementation.

This aspect of the family tree model follows from a literal interpretation of 
the unitary organism cum parthenogenetic reproduction analogy. A language is 
composed of many linguistic constructs, but there is no threshold beyond which
a language ceases to be a member of a lineage due to change of these linguistic 
constructs. That is, it is possible for a language to change all the constructs 
inherited from a remote ancestor and still stay within a lineage. For example, if 
in Figure 2.1, language E could be demonstrated to have descended from B, and 
B could be demonstrated to have descended from A, then it would follow that 
E is a descendant of A and within the family of languages defined by A even if 
A and E share no more linguistic constructs than any randomly selected pair of 
languages might share. So, membership within a lineage is not dependent on the 
possession of any particular feature or set of features, or even on the possession 
of any shared feature or set of features. It is based simply on the fact of common 
descent, no matter how this is determined. Nonetheless, in the usual course of 
things, common descent implies a certain number of shared features with other 
members of a lineage, and these common features are required by the compara- 
tive method for establishing membership within the lineage. 
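
The reasoning in this paragraph can be made concrete with a minimal sketch in Python. The branching below is an assumption: the text states only that A is the immediate ancestor of B, C, and D, and that E descends from B; the placement of F through J is hypothetical, as are all function names. The point of the sketch is that descent and relatedness are computed from the parent links alone, without consulting any linguistic feature.

```python
# Assumed shape for a tree like Figure 2.1 (child -> its single parent).
# Only "B, C, D descend from A" and "E descends from B" come from the text;
# the placement of F, G, H, I, and J is a hypothetical choice for illustration.
PARENT = {
    "B": "A", "C": "A", "D": "A",
    "E": "B", "F": "B",
    "G": "C", "H": "C",
    "I": "D", "J": "D",
}


def ancestors(language):
    """Return the chain of ancestors of a language, nearest first."""
    chain = []
    while language in PARENT:
        language = PARENT[language]
        chain.append(language)
    return chain


def is_descendant(language, ancestor):
    """Descent is transitive: E < B and B < A together give E < A."""
    return ancestor in ancestors(language)


def closest_common_ancestor(x, y):
    """Return the nearest node shared by the lineages of x and y, or None."""
    lineage_y = [y] + ancestors(y)
    for node in [x] + ancestors(x):
        if node in lineage_y:
            return node
    return None


def related(x, y):
    """Two languages are related if and only if they share an ancestor."""
    return closest_common_ancestor(x, y) is not None
```

Under the assumed branching, `is_descendant("E", "A")` holds by way of B, and `closest_common_ancestor("F", "G")` is A, which corresponds to the claim that F and G have no special relationship beyond common descent from A. Nothing in the sketch records features, so E would remain in the family defined by A whatever features it retained.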

It is important to emphasize here that acceptance of the family tree model 
of genetic relationship does not in itself preclude an appreciation of the role 
of language contact in the historical development of languages. Instead, what is 
implied by the model is that contact, along with other modes of language change, 
is irrelevant for genetic classification. Nonetheless, issues relating to contact situ- 
ations have formed the bases for criticisms of the model. We will consider several 
classes of such criticisms below. 

In sum, the family tree model of genetic relationships rests on a set of assump- 
tions that can be summarized as follows, all of which, in one way or another, 
follow from the unitary organism and parthenogenesis analogies: 


1 Languages are unitary systems: they are wholes, not entities defined by their 
parts (the unitary organism analogy). 

2 Two languages are genetically related if they descend from a single common 
ancestor (the parthenogenesis analogy). 

3 New languages can only be created by splitting off from an existing language 
(the parthenogenesis analogy). 

4 Linguistic splits are final and produce independent linguistic systems (the 
parthenogenesis analogy). 

5 No linguistic feature or set of features is required for genetic relationships to
exist between two languages (though such features are required for establishing
such relations) (the unitary organism analogy).

6 Language contact is irrelevant for determining genetic relationships (the unitary 
organism and parthenogenesis analogies). 


Assumption (5) probably requires some additional comment. Shared features 
are required for the operation of the comparative method, but the comparative 
method is not in itself a component of the family tree model, but rather a 
methodology traditionally allied with it. As noted, the methodology for establishing 
genetic relationships utilized by Greenberg and his associates (Greenberg 2005; 
Ruhlen 1994) is at odds with the comparative method, but is quite compatible
with the family tree model. 


3.2 Alternatives to the family tree model


The alternative models implied by the biological analogy have not received much 
attention, at least until recently. A unitary organism model employing sexual repro- 
duction has been notably absent from discussions of genetic relatedness, although 
Mufwene in a series of publications (e.g. 2001; 2007) has held that speciation 
(language splits) in the evolution of a language often comes about via language
contact, which could suggest a sort of sexual model of speciation, at least in some 
instances. Croft (2000) also discusses a sexual analogy in the creation of mixed
languages. Population models have been explored by various linguists in recent
times, again by Croft (2000) and Mufwene (2001; 2007), though the consequences
of models of this sort for language relatedness and our conceptualization
of language generally have yet to be fully explored. In a population model, the
gene pool, or its analog, could be considered variable, and new genetic material 
may be acquired by the species through hybridization, as well as by mutation 
and other means compatible with contemporary biological models. 

It should be noted that the models of genetic relatedness discussed above 
represent conceptually the simplest sorts of models: those based on the simplest 
analogies with the biological domain. One could, of course, propose more complex 
models. For example, many plants can reproduce both sexually and asexually and 
a model of genetic relatedness for languages could be based on the possibility of 
both sorts of reproduction, which might include principled reasons for deciding 
which sort of reproduction has taken place in any given instance. Croft (2000) 
notes this possibility. 

The one clear advantage of the family tree model over all the biologically inspired 
alternatives is that, at a macroscopic level, it provides a conceptually simple descrip- 
tion of what happens to families of languages, such as Indo-European, during the 
course of their evolution and where individual languages should be placed within 
the set of their known relatives. This simple classificatory system has undeniable 
appeal, and it is notably the case that linguists have attempted to replicate the 
apparent early success of Indo-European and Semitic linguistics in establishing 
family trees for all the other proposed language families. 


4 Genetic Classification and Language Contact 


In the sections that follow, I will discuss a number of situations involving language 
contact and see how these might be interpreted to affect genetic classification.


4.1 Borrowing in the absence of speciation


I will begin the survey with a discussion of a few basic instances of borrowing 
in the absence of speciation (birth of new languages) and where generational 
transmission of a linguistic tradition is not disrupted. The central issue here is
what effect borrowing under these conditions has on the genetic classification of 
languages. 

Borrowing resulting in the transfer of vocabulary items, or even long-term bilin- 
gual situations resulting in the transfer of syntactic constructions from one lan- 
guage to another, would have no effect on the genetic classification of a language 
if we agree that genetic classification is simply a record of the history of normal 
generational transmission of a linguistic tradition. In principle, this would be 
true even if the borrowing were massive and even if the language underwent 
metatypic change — that is, if the language changed from one morphosyntactic 
type to another. 

If we assume the essentialist approach, borrowing would have an effect on genetic 
classification only in the event of large-scale borrowing associated with the 
creation of new languages, i.e. in instances of speciation. Such instances will be 
discussed below. Where speciation is not involved, the essentialist approach 
allows for even massive borrowing without affecting genetic classification: in such 
cases, it is in complete agreement with the generational transmission approach. 

Most hybrid approaches also consider borrowing to affect genetic classification 
only in the event of speciation. It is difficult to find instances where contempor- 
ary scholars have claimed that a language changed genetic classification through 
borrowing without having undergone speciation, but some individual cases 
might be understood in this light. For example, Wexler (1991) has claimed that 
Eastern Yiddish is relexified Judeo-Sorbian; given that Yiddish is generally 
thought to be a Germanic language, this would appear to amount to a claim that 
Yiddish changed its genetic status without speciation, although Wexler’s (1991) 
specific claim is that Yiddish is really a Slavic language despite relexification and 
that Western and Eastern Yiddish are genetically unrelated, the former being a 
Germanic language. Wexler (2002), however, claims explicitly that Modern Hebrew
is relexified Yiddish, and that the two languages are “genetically related” (2002:
3), but the creation of Modern Hebrew should probably be seen as a case of
speciation, and hence it would fall outside the category of borrowing in the absence 
of speciation. This situation is discussed further in section 4.2. 

Most instances of languages that have borrowed so heavily that their genetic
status would be somehow in dispute would probably qualify as mixed languages
and also as products of speciation. One possible partial exception is
Dongolawi, described by Heine and Kuteva (2001). In the course of its evolution, 
Dongolawi, a Nubian language, came under the influence of Nobiin, another Nubian 
language, though in a different branch of the family. The language retained much 
of its native vocabulary, but borrowed most of its grammar, including grammatical 
morphemes, from Nobiin. This seems not to have been a case of speciation, but 
rather of evolutionary change within a tradition of generational transmission. Heine 
and Kuteva claim (2001: 401) that modern Dongolawi is a “daughter” of both 
pre-contact Dongolawi and Nobiin: certainly the elements that the comparative 
method would use to establish the genetic affiliation for this language are mixed. 
The language can be viewed as a mixed language, but does not seem to have 
undergone speciation into its mixed form. On the face of it, Heine and Kuteva’s
claim that modern Dongolawi is a daughter of two languages is not consistent 
with the generational transmission or essentialist approaches, but rather supposes 
some sort of hybrid model. 


4.2 Substratic influence 


It is useful to separate out substratic influence from the cases of borrowing dis- 
cussed in the last section. By substratic influence I mean a situation whereby a 
language previously spoken by a community affects the language the commu- 
nity later comes to speak. For example, the English spoken in Ireland has been 
affected in a variety of ways by Irish, the language previously spoken by the 
population of the country. It has often been asserted that French is the product 
of Vulgar Latin with a Gaulish and probably Vasconic substratum. 

The reason for separating substratic influence from other instances of borrowing
is that with substratic influence we have situations in which generational
transmission of linguistic traditions is disrupted; we may also have speciation, though
this is not necessarily the norm. In this section, I will discuss substratic influence 
in a general way; the effects of substratic influence in the languages traditionally 
designated as creoles will be discussed separately in section 4.4. 

Instances of substratic influence are not problematic for the generational trans- 
mission or essentialist approaches as long as there are some members of the com- 
munity who continue the generational transmission of the linguistic tradition, and, 
in the case of the essentialist approach, some “core” elements are included in the 
language of the new speakers. 

Nonetheless, we can imagine situations involving substrata that would be 
challenging for these two approaches, in particular the generational transmission 
approach. For example, suppose that an entire community decided to adopt a new 
language in the absence of any native speakers of that language. What would be 
the genetic affiliation of this new linguistic variety? We have a specific instance 
that fits this scenario in the case of Modern Hebrew. It was noted in the last 
section that Wexler (2002) has claimed that Yiddish and Modern Hebrew are genet- 
ically related since Yiddish formed the substratum for a revived Hebrew. Most 
scholars place Modern Hebrew in the Semitic family, and indeed the classical 
comparative method would unproblematically treat the language this way. 
Nonetheless, unbroken generational transmission did not take place in such a way 
that would link Modern Hebrew with the other Semitic languages. The essentialist 
approach, however, can be interpreted to fit the connection between Modern Hebrew 
and Classical Hebrew, and so the classification of Modern Hebrew as Semitic is 
basically essentialist. 


4.3 Koineization and the loss of autonomy


Related linguistic varieties frequently come into contact, for instance in national 
institutional settings, as a result of migration or trade, and in colonial “tabula rasa” 
situations, i.e. where there were no varieties of the language spoken in the region
before. One result of such contact may be the creation of a koine. The original koine
(the Koine) was a variety of Ancient Greek which had come to supplant other, 
local Greek dialects during the Hellenistic and Roman periods. All Modern 
Greek dialects, save one, are descendants of the Koine. The Koine was based mostly 
on the Athenian dialect, but included many elements from other dialects and 
involved a certain amount of simplification: the disappearance of irregularities in 
favor of structurally regular forms. 

The term koine has come to be used for any variety which supplants het- 
eronomous varieties and serves as a means of intercommunication between 
speakers of these varieties. This comes about as a result of dialect leveling, i.e. 
the loss of distinctive features in favor of features with a high degree of mutual 
intelligibility and/or high prestige. Sometimes this involves a fair amount of dialect 
mixture, though this needn’t be the case. Where dialect mixture is involved, the 
process of creating the koine can be referred to as koineization. Koineization has 
probably been a fairly common feature of the history of languages. For example, 
it seems to have operated at least twice in the history of the mainland 
Scandinavian varieties (Dahl 2001). The new koine may exist alongside all or some 
of the varieties that existed before koineization, or it may supplant them com- 
pletely, either regionally or everywhere the varieties were spoken. 

Koineization is not problematic for the approaches using the family tree model 
(generational transmission and essentialist approaches) as long as it can be main- 
tained that the koine is essentially a continuation of one of the original varieties, 
as in Figure 2.2a below, where the koine is a continuation of D and only the koine 
survives. Figure 2.2b, however, diagrams a scenario where the koine cannot 
be non-arbitrarily placed under any of the previously existing varieties because 
it incorporates too many features from more than one variety. Cases like this would 
be consistent with some hybrid approaches, but not with the family tree model. 
The question is, do we find real examples that are like Figure 2.2b? 

There do appear to be cases that fit this model. Trudgill (2004) discusses 
instances of dialect creation in colonial tabula rasa situations where it would be 
impossible in a non-arbitrary way to assign the koine resulting from the mixture 
of many dialect forms to any source dialect. A more interesting, if more unusual, 
case is the Romansch variety known as Rumantsch Grischun. This variety was 
created artificially by the Swiss linguist Heinrich Schmid, who applied a statistical
approach to the forms found in the surviving Romansch dialects. It was not
intended to supplant the dialects, but rather to be used where there is a need for
a single variety intelligible to all (Haiman & Benincà 1992). In that role, it has
achieved a fair amount of success and is now widely used in publications and
official signs.

Figure 2.2 Koineization in the family tree model

Figure 2.3 The remerging of an independent language

Rumantsch Grischun is not obviously a descendant of any of the traditional
dialects — it was deliberately designed not to be. It would appear, therefore, to fit 
the schema diagrammed in Figure 2.2b, except that it did not displace the other 
varieties. 

A related kind of situation is discussed by Dixon (1997). Okinawan had 
achieved the status of an independent language after 700 years of independent 
evolution from Japanese, and was the official language of the Ryukyu Kingdom. 
It has since remerged with Japanese after the formal annexation of the Ryukyu 
Kingdom by Japan in 1879. Speakers of the various Ryukyu dialects now con- 
sider their varieties to be dialects of Japanese, indicating that the language, which 
had achieved autonomy (in the sense of Trudgill 2000) as a state language, is now 
heteronomous with Japanese after many decades of intense pressure from the 
Japanese authorities. The loss of autonomy has been accompanied by a degree of 
linguistic convergence with Japanese. The situation as described would suggest 
a model like that in Figure 2.3. After the split, which produces an independent 
B, A and B come into contact again, with the result that A strongly influences B, 
and B loses its status and is incorporated back into A. In the case of Okinawan, 
A represents Japanese and B the various Ryukyu varieties, including Okinawan. 
Situations like this have probably been fairly frequent in the history of languages. 
For example, the Gallo-Romance varieties that produced French, Gascon, and
Provençal have remerged into French through a process like that described
for Japanese and Okinawan. 

Situations that can be characterized by the diagram in Figure 2.3 are not compatible
with the family tree model, which supposes that linguistic splits are final and can- 
not be undone. On the other hand, neither the generational transmission nor the 
essentialist approaches are incompatible with Figure 2.3. This is one of the few 
cases where these approaches and the family tree model make different predic- 
tions about possible developments. 
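
The structural difference between these scenarios can be sketched in Python. The encoding is an assumption made for illustration: each variety is mapped to the set of varieties it continues, the sister varieties B, C, and D under A are hypothetical stand-ins for the unlabeled nodes of Figure 2.2, and the function names are invented. The check covers only the family tree model's requirement of a single immediate ancestor; it says nothing about the generational transmission or essentialist approaches, which, as noted above, are not incompatible with Figure 2.3.

```python
def hybrid_varieties(sources):
    """Return, sorted, the varieties that continue more than one source."""
    return sorted(v for v, parents in sources.items() if len(parents) > 1)


def fits_family_tree(sources):
    """The family tree model allows at most one immediate ancestor per variety."""
    return not hybrid_varieties(sources)


# Hypothetical encodings of the three scenarios (variety -> set of sources).
# Figure 2.2a: the koine simply continues D.
FIGURE_2_2A = {"B": {"A"}, "C": {"A"}, "D": {"A"}, "koine": {"D"}}

# Figure 2.2b: the koine draws on several varieties and cannot be placed
# non-arbitrarily under any one of them.
FIGURE_2_2B = {"B": {"A"}, "C": {"A"}, "D": {"A"}, "koine": {"B", "C", "D"}}

# Figure 2.3: B splits from A and is later absorbed back into it, so the
# later stage of A continues both lines.
FIGURE_2_3 = {"B": {"A"}, "A after remerger": {"A", "B"}}
```

With these encodings, `fits_family_tree(FIGURE_2_2A)` is true, while `hybrid_varieties` picks out the koine in the second scenario and the remerged variety in the third as the nodes that break the single-ancestor requirement.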


4.4 Creoles 


Creoles are usually defined as languages which develop from pidgins when the 
latter take on native speakers. In the early stages of its development, a creole
typically has a lexifier language, a language which is the source of the great bulk
of its vocabulary. The grammar of a creole is usually thought to be simpler than
that of its lexifier: in fact some, like McWhorter (2001), claim to be able to recognize
creoles by their radically simple structure alone.

In a long tradition dating back to the early nineteenth century, creoles have 
typically been excluded from the family trees of their lexifiers. If one accepts the 
characterization of creoles that is enshrined in most introductions to linguistics 
and summarized in the paragraph above, the exclusion of, say, Jamaican Creole 
from the Germanic languages and Haitian Creole from the Romance languages 
follows from assumptions about the nature of genetic relatedness embodied in 
the two main approaches. 

Given the characterization above, the generational transmission approach would 
exclude creoles from the genetic lines of their lexifiers because these languages 
are not the products of regular generational transmission of a linguistic tradition 
given their origin in pidgins: a pidgin is, by definition, not a native language, so 
parents could not be transmitting to their children the linguistic tradition that they 
themselves acquired as children. The essentialist approach would exclude them 
because crucial core grammatical features are missing from the radically simplified
structure of creoles: in the historical progression from lexifier language
to pidgin to creole, the grammatical essence of the lexifier language has been lost.

Over the last few years, a number of scholars have challenged the narrative 
about the genesis of creoles summarized in the first paragraph of this section.
The new, revisionist narrative is the product of a line of research into the his- 
tories of creoles and the populations that speak them. The revisionist position is 
based on the idea that the creoles of the Atlantic and the Indian Oceans, the proto- 
types for this class of languages, did not develop from pidgins. Instead, most 
of these creoles initially developed via normal generational transmission among 
communities which included significant numbers of European native speakers (albeit 
mostly of nonstandard varieties) along with other peoples whose composition 
varied from place to place. That is, in origin, these varieties were essentially no 
different from colonial varieties generally. Where they came to differ from other 
colonial varieties, such as Brazilian Portuguese, Quebecois French, North American 
English, etc., has to do with subsequent history. For example, the proto-creoles 
in regions that experienced the rise of large-scale plantation culture came to be 
spoken by large numbers of new immigrants, whose languages formed substrata 
which influenced the subsequent development of the languages, and since these 
new immigrants were mostly slaves, who experienced increasing segregation from 
other native speakers of the colonial languages, the proto-creoles were socially 
isolated and developed along different paths from other varieties of the language. 

The sketch presented in the last paragraph is a simplification of the revisionist 
view, and the reader is encouraged to consult the references provided. However,
given our limited objectives here, it will suffice. It remains to be seen how
acceptance of the revisionist model would affect views of the genetic status of 
these languages. In the generational transmission approach, creoles would now 
be seen as legitimate offspring of metropolitan languages: Jamaican Creole is a 
Germanic language and Haitian Creole is a Romance language. This would follow
because these languages were the product of regular generational transmission by
at least some components of the speaker base at each stage in their development. 

Within the essentialist approach, creoles would also now be seen as descendants 
of the metropolitan languages. While many grammatical features of the standard 
versions of the metropolitan languages may be absent, their absence was not abrupt 
and resulted from internal changes, perhaps spurred on by substratic influence 
and even influence from other creole varieties. 


4.5 Mixed languages 


Mixed languages have attracted a good deal of attention over the last two 
decades. Versteegh (2007) points out that there may not be a linguistically valid 
category of mixed languages, but I will assume here that there is for the sake of 
discussing how such languages might be dealt with from the standpoint of 
genetic linguistics. 

In some respects, mixed languages are related to mixed code varieties found 
in bilingual situations. The primary difference is that mixed languages have 
achieved autonomy as linguistic systems and some degree of stability in the sense 
that the variability found in mixed code varieties is considerably reduced. It has 
been claimed (e.g. Winford 2003; Dixon 1997) that mixed languages arise only under 
a specific set of circumstances, namely where bilingual communities feel the need 
for a distinctive, in-group language and create a stable mixed-code variety for this 
purpose. Languages like Michif and Copper Island Aleut fit this model. In these 
special mixed languages, there is little or no simplification of the two components 
of the language because the population producing the new system is fully com- 
petent in both. 

The generational transmission approach runs into the interesting problem that, 
if the population is bilingual from childhood, as the originators of Michif and Copper 
Island Aleut probably were, then both systems are transmitted normally, and the 
mixed language, being a product of this normal generational transmission, has 
in a real sense two genetic parent languages. The essentialist model, which looks 
to the transmission of core grammatical forms, would find, in the case of Michif, 
that the Noun Phrase is fully French and the Verb Phrase is fully Cree: the essence 
of these systems seems to have been transmitted, but only in a component of the 
grammar. Mixed languages have no place in the family tree model. Hybrid 
approaches, needless to say, would find such situations less paradoxical. 


5 Language Contact and Speciation 


Speciation in the sense used here, i.e. the creation of new languages, is a complex 
issue in several respects. One complication is the language–dialect problem in 
cases of dialect continua: at what point should historic varieties of a language be 
considered separate languages? Needless to say, mutual intelligibility is a factor, 
but hardly a criterial one: there are many instances of languages with mutually 
unintelligible dialects. Similarly, varieties that are mutually intelligible may be 
considered separate languages. In the end, sociocultural attitudes are the deter- 
mining factors in such cases. 

The problem which concerns us here is the degree to which language contact 
may result in speciation. Language contact can induce change of various sorts in a 
language, through simple mechanisms like borrowing, but also through more 
complex ones resulting from extensive bilingualism and/or substratic influence. 
Change arising from any or all of these mechanisms may have unequal con- 
sequences for different varieties of a language, resulting in reductions in mutual 
comprehensibility. This will increase the likelihood of speciation, though in 
itself it does not cause it. In the end, speciation remains largely a matter of social 
attitudes and purely linguistic considerations are only a factor, though not a 
negligible one. 


6 Final Thoughts 


How one sees the consequences of language contact affecting genetic relations 
depends on one’s adherence to particular approaches to the nature of genetic rela- 
tionships and models of remote relations. These are bound to evolve as linguists 
become more familiar with the histories of the various sorts of contact situations. 
We already see the emergence of new sorts of models in the works of Croft and 
Mufwene. We can expect to see further developments along those lines. 


NOTES 


I would like to thank Edith Moravcsik for helpful discussions about the issues discussed 
in this paper. 

1 The contemporary disassociation of the genetics of people and the genetics of the 
languages they speak did not characterize the early nineteenth-century Romantics 
who created the family tree model which dominates modern conceptions of genetic 
relationships among languages. For them, there was a straightforward connection 
between culture, language, and people (Thom 1995). It was only later that tracing the 
origins of people and tracing the origins of languages came to be considered separate 
subjects of inquiry, though interestingly there has been renewed interest in this con- 
nection in recent times, for example in Cavalli-Sforza et al. (1988), Cavalli-Sforza (2000), 
Ruhlen (1994). 

2 As Robins (1990: 186-7) points out, the eighteenth-century view of genetic relations 
among languages was largely typological in the modern sense. The Encyclopédistes did 
not consider French a descendant of Latin because the grammars were so different; 
instead, they considered French to be a continuation of Gaulish, which had adopted 
a Latin vocabulary. Similarly, Sir William Jones, who famously described a kinship 
relation between Sanskrit, Latin, and Greek, did not regard Hindi as a descendant of 
Sanskrit because of their very different grammars. 
Knowing that Irish and Hindi are Indo-European languages tells us almost nothing 
about the structures of the two languages; knowing that Hindi is South Asian tells us 
a good deal more. 

In assessing genetic relatedness, it is not important that all instances of generational 
transmission follow this model since a language can acquire new adult speakers. It is 
necessary only that there be some instances of unbroken transmission in the linguistic 
community. 

It’s worth emphasizing that the comparative method is not in itself a model of genetic 
relatedness, but rather a technique for demonstrating it. It’s not the only such tech- 
nique currently in use: for example, Greenberg and his associates (Greenberg 2005; 
Ruhlen 1994), controversially, have used a technique at odds with the comparative 
method for assessing genetic relationships. See Campbell (1997, 2003) for critiques of 
this approach and other contemporary alternatives to the comparative method. 

Some of the scholars listed are less than clear about whether they believe that in 
principle there are hybrids or whether under some circumstances it is impossible to 
say what taxonomic unit a language belongs to. Others explicitly or implicitly allow 
hybridity, e.g. Benedict, Croft, Laks, and Holm. 

“I remain doubtful, however, of the possibility of successfully reconstructing Chinese 
linguistic history strictly from the evidence of modern dialects by the traditional com- 
parative method as applied to languages without a written tradition. A major difficulty 
is that the Stammbaum or branching-tree model that is implied by the traditional com- 
parative method is totally unrealistic in the case of Chinese” (Pulleyblank 1998: 200). 
So, for example, it might not be possible to classify Cantonese within a taxon at the 
level of other Chinese languages, but it can still be placed within Sinitic. 

In fact, genetic classifications of languages have served many purposes within 
and outside linguistics, including various intellectual and even political causes. For 
example, awareness of the classification of Rumanian as a Romance language ultimately 
affected its writing system (from Cyrillic to Roman) and encouraged the movement 
to purge the language of Slavic and other non-Italic elements. 

The relationship between linguistics and biology was not simply a one-way expro- 
priation of ideas. Nettle (1999: 4) points out that the success of Indo-Europeanists in 
the nineteenth century in charting the histories of languages encouraged Darwin in 
his development of the theory of evolution. 

Croft (2000) and Mufwene (2001, 2007) are notable exceptions. 

In the nineteenth century, the doctrine of “biological naturalism” in linguistics did make 
explicit the connection between languages and living organisms. Ivić (1965), in her 
history of linguistics, says the following about August Schleicher, an early and very 
influential comparativist: “His [Schleicher’s] method grew out of his conception that 
language was a living organism, independent of man, whose line of development 
was determined by the general biological laws of evolution: a language is born, lives 
for a certain time, gives life to another, younger language which in time replaces it, 
in its turn to be continued by one of its own offshoots; this language, like man, has a 
‘genealogical tree’, i.e. a common ancestor from which numerous related progeny have 
developed as branches of the tree (hence Schleicher’s theory is called the theory of 
biological naturalism in linguistics, and is known under the name of the ‘stammbaum’, 
or ‘pedigree’ theory)” (p. 44). 

By linguistic construct I mean any unit or construction within a language. 

Croft (2000), for example, views languages as populations of utterances produced by 
a set of communicating speakers. 
Mufwene (2001) argues that languages should be conceptualized in a way analogous 
to species. 

See note 5 and the references cited there for discussion. It suffices here to say that 
while the comparative method and the various alternative approaches differ in how 
relationships may be established, they suppose an identical model of how languages 
may be related. It is the latter issue that concerns us here. 

Campbell (2004: 212) notes that “there is no provision in the comparative method for 
dealing directly with borrowings.” 

The essentialist approach to genetic relatedness discussed in section 2 claims only that 
certain features must be preserved in immediate mother-daughter relationships, but 
not in more remote relationships. 

An entire issue of Linguistic Typology (volume 5.2/3, 2001) was devoted to the issue 
of whether creoles constitute a special, and especially simple, structural type. 

See, for example, Mufwene (2002; 2007), Chaudenson (2001), and Ansaldo, Matthews, 
and Lim (2007). My discussion is based largely on Mufwene. 

The authors of these works suggest that there may not be a linguistically significant 
category of creole languages, and what the languages so labeled have in common is 
their development during a particular period of world history and the fact that they 
are spoken by people of non-European descent. 

Useful information and analysis can be found in Thomason and Kaufman (1988), Bakker 
and Mous (1994), Winford (2003), and Mous (2003). 
3. Contact, Convergence, 
and Typology 


YARON MATRAS 


1 A Definition of Contact 


Although Weinreich (1953), the pioneer of language contact studies, remarked 
that the true locus of language contact is the bilingual individual, much of the 
contemporary research in the field is guided by the assumption that language 
contact is about the way in which linguistic systems influence one another. 
Contact-induced language change is consequently seen as change that is “exter- 
nal” to the language system. I follow an approach to language contact that is 
based on a view of language as the practice of communicative interaction, and 
of grammatical categories as triggers of language processing tasks. According 
to this approach, the speaker’s choice of structures and forms matches the 
linguistic task-schema that the speaker wishes to carry out. This, in turn, is 
subordinate to the goal-oriented activity that the speaker pursues through 
verbal communication in discourse. My assumption is that bilinguals – whether 
“balanced” or “fluent” bilinguals, or “secondary,” “late,” or “partial” bilinguals 
– do not, in fact, organize their communication in the form of two “languages” 
or “linguistic systems.” 

Rather, bilinguals have an enriched and extended repertoire of linguistic struc- 
tures at their disposal. As part of their linguistic socialization, they learn which 
word form, construction, or prosody pattern is appropriate in a specific context 
of interaction. Some contexts allow greater flexibility of choices. These are the 
contexts in which bilinguals can make the most effective use of their full reper- 
toire, exploiting nuances as well as contrasts between variants of equivalent or 
near-equivalent meaning (cf. Grosjean’s 2001 notion of “bilingual mode”). Other 
sets of contexts are more exclusive with regard to the selection of structures within 
the repertoire. 

Rules governing the selection of context-appropriate structures form part of bilin- 
guals’ communicative competence. They operate on the basis of established asso- 
ciations between a subset of structures and a set of interaction contexts. As a society 
we draw on this association to construct our notion of a “language” or “language 
system.” Bilingual children become exposed to this socially constructed narrative 
line — the idea that they speak two “languages” — around the age of 2.6-3. Until 
then, their use of word forms and constructions is governed by a prolonged pro- 
cess of trial and error that is usually unaccompanied by any explicit analytical 
labeling of the elements of their repertoire. 

An association between structure and a specific set of interaction contexts does 
not necessarily exist for each and every item in the linguistic repertoire. 
German-English bilinguals, for example, accept that their repertoire contains 
only one single word form for concepts such as “internet,” “download,” 
“computer” (subject of course to embedding in different phonological and 
morphosyntactic environments). Speakers of the Jerusalem variety of Domari, 
an endangered New Indo-Aryan language spoken by small and dispersed 
communities in the Middle East, take for granted that their repertoire of 
linguistic structures contains only a single set of conjunctions, interjections, 
focus particles, and discourse markers, which are all shared by Domari and 
its contact language, Arabic. In historical-descriptive terms, German has “borrowed” 
the words Internet, Computer, and downloaden from English, and Domari has 
“borrowed” its entire set of conjunctions and discourse particles from Arabic in 
a situation of “contact,” where at least some speakers alternated between the use 
of two languages. 

In the model pursued here, language contact phenomena or “borrowings” are 
regarded as the outcome of function-driven choices in which speakers license them- 
selves, while interacting in one set of contexts, to employ a structure (word form, 
construction, meaning, phonological features, etc.), despite its original associ- 
ation with a different set of interaction contexts. When claiming that choices are 
function-driven, I am not suggesting that the selection of structures is necessarily 
conscious, deliberate, or strategic. I propose that contact phenomena are arranged 
on a continuum, from those that are not at all voluntary (e.g. phenomena known 
as “interference” or “transfer,” or errors in the selection of the appropriate language 
form), indeed even counter-strategic in their origin (cf. Matras 2000 and 2007a), 
to those that are conscious and deliberate (such as language mixing for stylistic 
purposes). All, however, are functional in the sense that they are the product of 
language processing in goal-oriented communicative interaction. The susceptibility 
of certain structural categories to contact-related change is therefore not accidental, 
but inherently bound with the task-oriented function that those categories have, 
i.e. with the way they support language processing in discourse. Contact phenomena 
are in this respect seen as enabling rather than as interfering with communicative 
activity. 

I shall be following this perspective in the next sections, where I take up two 
main issues: The first is the process that is captured by the notion of “convergence,” 
its functionality, and its potential effects on the typological profile of languages. 
The second is a typological-universalist approach to structural borrowing in 
language contact situations. 
2 Convergence 


2.1 Matter and pattern 


A very broad interpretation of the term “language convergence” might imply an 
increase in similarities between two languages at any level: lexical, phonological, 
typological (cf. Silva-Corvalán 1994: 4–5). In practice, the term tends to be used 
most often with reference to the kind of constructions that are best described as 
linguistic patterns, i.e. those that involve a specific mapping relation of meaning 
to form, or a structural relation among two or more word forms, expressed for 
instance through their position. This distinguishes patterns from linguistic matter, 
which is the concrete phonological shape of morphemes and word forms (see 
Matras 2009; Matras & Sakel 2007). 

This distinction is well established in the literature on language contact. 
Haugen (1950) speaks of “calques,” a term that has received wide circulation, 
and Weinreich (1953) speaks of “convergent development” to describe a change 
in the function of morphemes that takes place in a “replica language,” inspired 
by a “model language” (see also Heath 1984: 367 for the term “pattern transfer”). 
Myers-Scotton (2006: 271) describes convergence as a combination of surface-level 
forms from one language, with an underlying abstract lexical structure from another 
language (cf. also Bolonyai 1998). Discussing a Melanesian case study, Ross 
(1996, 2001) coins the term “metatypy” to denote the sharing of organizational 
structures across languages in a situation where social attitudes disfavor the 
replication of concrete word forms whose origin in another language is easily 
identifiable. 


2.2 Convergence and grammaticalization theory 


Growing interest in functionalist explanations of language change and the rise of 
grammaticalization theory (e.g. Heine, Claudi, & Hünnemeyer 1991; Hopper & 
Traugott 1993) had an impact on the study of language convergence. In particular, 
language typologists identified contact as a potential trigger for typological 
change and sought to apply functional-typological models to explain this kind of 
change. Many contact linguists, too, recognized that grammaticalization was 
involved in many of the structural processes of change observed in contact situ- 
ations. The development of operational structures in creole languages, for example, 
can be described largely as a process of grammaticalization of lexical material from 
the lexifier language, giving rise to a new approach to the relationship between 
the source or lexifier language and creoles (cf. Keesing 1991; Bruyn 1996; Plag 2002; 
Heine 2005; and see already Givón 1982). With reference to Southeast Asia, 
Bisang (1996; 1998) regards the sharing of grammaticalization pathways among 
languages as the key factor behind the emergence of areal linguistic similarities 
and so as a key approach in what has become known as “areal typology”: the 
study of the typological features that are shared by geographically contiguous 
languages (cf. Dahl & Koptjevskaja-Tamm 2001; Ramat & Stolz 2002; Ramat & 
Roma 2007; Muysken 2000). 

Overall, the discussion of contact-induced grammaticalization addresses sev- 
eral issues, among them the speaker’s motivation to engage in acts of linguistic 
innovation, and constraints on the directionality of the process. In Matras (1994: 
67, 241-3; 1998b) it was suggested that bilingual speakers benefit from being 
able to syncretize the mental planning operations applied while interacting in 
each of the languages. This allows effective exploitation of the full linguistic 
repertoire, on the one hand, while still complying with the constraints of context- 
appropriate selection of overt word forms, on the other. In order to achieve 
this, bilinguals exploit the meanings and functions of inherited structures and 
enhance them to carry out organization procedures that are replicated from the 
model language. 

Haase (1991) similarly notes that bilingual speakers are motivated to avail them- 
selves of the expressive means of both their languages and thus wish to have equal 
constructions at their disposal in each language, but they can only do so if they 
are able to identify parallel items in the two languages as translation equivalents. 
This means that the grammaticalization process begins by matching lexemes to 
one another and adapting the range of meanings expressed by the lexemes of the 
replica language to those expressed by the parallel lexemes in the model. The basis 
for the matching procedure is the polysemy of the word in the model: Usually 
the model word has both a concrete meaning, and a more abstract one. Consider 
for instance the English word up, and its “concrete” spatial-locational meaning, 
alongside the counterpart expression in shut up, where up takes on the function 
of an abstract modifier. The process of grammaticalization therefore proceeds along 
a hierarchical scale from more concrete, lexical meanings to the more abstract, 
grammatical functions (cf. Nau 1995: 175-6; Haase 1991: 169), a property that 
has been referred to as the “unidirectionality” of the grammaticalization process 
(cf. Haspelmath 1999). 

Heine and Kuteva (2003; 2005) base their model of contact-induced grammat- 
icalization on the notion of a mental comparison between a model and a replica 
language, as a result of which a construction is identified in the replica with the 
potential to carry the same meaning as the target construction in the model. The 
candidate construction is then grammaticalized in order to take on the meaning 
conveyed in the model. The concrete changes may involve expansion of a con- 
struction from minor to major use patterns including an increase in frequency, 
extension of its distributional context, extension across categories, and the emer- 
gence of new categories. The unidirectionality of grammaticalization is manifested 
in the emergence of novel meanings, semantic bleaching or blurring of existing 
meanings as lexemes take on more abstract grammatical functions, loss of mor- 
phosyntactic properties that are associated primarily with the content lexeme (as 
in the case of nouns that are grammaticalized into location expressions, or inter- 
rogatives that are used as subordinators), and possibly also through an erosion 
or reduction of phonetic substance. 
2.3 Pattern replication and pivot matching 


Replication of patterns might be viewed as a kind of compromise strategy 
that allows speakers to continue to flag language loyalty through a more or less 
rigid choice of word forms, and at the same time to reduce the load on the 
selection mechanism of linguistic structures by allowing patterns to converge, thus 
maximizing the efficiency of speech production in a bilingual situation. Consider 
example (1), from a German native speaker who is giving an interview for British 
television: 


(1) a. At the border in England, were by the custom/ 
b. They have investigated this car very very eh/ eh/ thoroughly and they 
have removed the panels from the doors, the panels from the luggage room, 
c. and they in/ investigated in the engine compartments aber they didn’t 
find anything, 
d. but the/ they have forgotten to got unten/ the/ [clears throat] they 
forgot to look under the car. 


As an indication of the speaker’s proficiency in English, note his use of some 
rather elaborate vocabulary, such as the words investigated, removed, thoroughly, 
or compartments. He also speaks fast, and in well-constructed sentences. But 
notice nevertheless how his native German influences his English speech. In 
line (a) the speaker follows German word order rules on the positioning of the 
finite verb (were) in second constituent position, following the prepositional 
phrase at the border in England. This verb is also used lexically in a way that resem- 
bles German, to express existence, whereas the normal English equivalent would 
have been there were. Following German usage, the speaker prefers the perfect 
tense to express past-tense events, whereas in English the more obvious choice 
for events that have no direct bearing on the present situation is the simple past: 
they have investigated, they have removed (b), they have forgotten (d). The speaker 
refers to the boot of the car as the luggage room (b), constructing the expression 
following the German model: Kofferraum, where -raum (similar to English room) 
actually means ‘space’. 

These idiosyncratic structures are normally perceived as cases of “interference” 
or “negative transfer” from the speaker’s native language. In fact, they constitute 
instances of the speaker availing himself simultaneously of components from both 
subsets of his repertoire: generalizing certain word order patterns, tense-aspect 
categories, and word-formation patterns across the various contexts of interaction, 
while selecting context-appropriate word forms (in most cases, with the exception 
of German aber ‘but’ in line c). These combinations are creative usages that enable 
the speaker to communicate even in the absence of full grammatical competence 
in English. While in this particular case they are unlikely to lead to language change, 
they are of precisely the same nature as those processes which, repeated and 
accepted by a collective of speakers and propagated across a speech community 
over a period of time, may indeed lead to contact-induced change. 
The following example from a German-English bilingual child demonstrates 
the same mechanism at work again: 


(2) German; age 6:0, addressing both parents, commenting on their conversa- 
tion (which is conducted in German): 
Was redet ihr über? 
what talk.2PL you.PL about 
‘What are you talking about?’ 
German: Worüber/Über was redet ihr? 


The child’s construction in (2) is an import of the English preposition stranding 
construction, but note that English does not, in fact, provide the full blueprint for 
the sentence. Rather, the use of German tense and question-phrase word order is 
correct, deviating from that of English. The construction is thus a blend, a hybrid: 
The speaker makes use of a key feature of the English model construction, while 
at the same time adapting it to German by complying with various rules of German 
morphosyntax, as well as using exclusively German word forms in overall com- 
pliance with the expectations of the ongoing interaction context (a conversation 
in German among the interlocutors). 

A model is therefore required that can explain convergence in light of two addi- 
tional properties (cf. Matras & Sakel 2007). First, the tendency of pattern replication 
from the model to comply not just with the norms on selection of replica-language 
word form, but also with other morphosyntactic constraints of the replica language. 
And second, the potential occurrence of pattern replication as a spontaneous 
production, and not just at the end of a gradual process of context- or meaning- 
extension as predicted by grammaticalization theory. In addition, a model of 
convergence must also be able to account for potential exceptions to the uni- 
directionality of grammaticalization, such as the loss of categories (e.g. the loss 
of the definite article in Romani dialects in contact with languages like Russian 
and Polish). 

Figure 3.1 depicts the stages in the process of “pivot-matching,” which offers 
an alternative model of convergence. The point of departure is the speaker’s aim 
to pursue a particular communicative goal, embedded into a particular commu- 
nicative context. This is transposed into a concrete linguistic task for which an 
appropriate task-schema (see Green 1998) needs to be assembled from within 
the linguistic repertoire. Scanning through the entire repertoire, the speaker 
identifies a construction that would serve this particular task most effectively. We 
assume that, when scanning the repertoire, the speaker has the entire repertoire 
at his or her disposal, and does not “block” or “de-activate” any particular 
language “system.” But the speaker is also conscious of the need to meet certain 
expectations of the interlocutor in respect of the choice of word forms. 

We assume that the optimal construction that was identified does not have an 
established structural representation that is appropriate for the present context. 
The speaker therefore tries to optimize communicative efficiency by combining the 
selected construction with context-appropriate word forms. In order to do this, the 
speaker de-constructs the construction by isolating its pivotal features, such as 
the preposition-stranding feature in example (2). This construction “pivot” is then 
matched to the inventory of context-appropriate forms. This inventory includes 
not just word forms, but also their formation and combination rules, leading the 
speaker in (2) to employ German tense and word-order rules. The outcome is a 
creative, innovative construction that is both task-effective and, seemingly at least, 
context-appropriate. 

Figure 3.1 Pivot-matching in pattern replication (from Matras 2009, ch. 9) 

Such creativity has the potential of increasing and enriching the inventory of 
constructions that speakers have at their disposal in a given set of interaction 
contexts — i.e. in a given “language.” But there is also the risk of misjudging 
the acceptability of new constructions to interlocutors. Interlocutors’ reactions are 
therefore crucial to the chances of a new construction to be genuinely effective, 
to be accepted, to be used by the speaker again, and to be replicated by others 
and so eventually lead to language change. Innovations introduced by single 
second-language learners are unlikely to be propagated, while in a situation of 
collective language learning, especially one where speakers have limited access 
to the target language, such as pidginization processes, innovations are more likely 
to prevail. Similarly, lax normative control by a parental generation or community 
institutions, e.g. in situations of creole formation, among immigrant communi- 
ties, or among some multicultural communities with open and flexible attitudes 
toward community boundaries and identity, is more likely to allow spontaneous 
innovations by speakers to be received favorably and possibly propagated to result 
in change. This — lax normative attitudes in a multilingual community with flexible 
identity boundaries — is the likely scenario behind the emergence of many of the 
world’s so-called “linguistic areas.” 
3 Convergence and Typological Change 


Convergence (pattern replication) of the kind described above can lead to a 
wholesale re-adjustment of the morphosyntactic patterns of a language in a given 
domain of grammar and so to a shift in type, or typological drift. In Matras (1994) 
I described how Romani adopted clause-combining strategies that were common 
in Byzantine Greek and other contact languages in the Balkan area, at the expense 
of what will have been its New Indo-Aryan legacy of (co-relative and nominal) 
constructions. The emergence of a uniform type of complementation in contigu- 
ous languages via a process of pattern replication can be illustrated both for Romani 
and for another language of the Balkans, the dialect of Turkish as spoken in 
Macedonia: 


(3) a. Macedonian Turkish: 
       (o)       istiyor        git-sin 
       3SG       want.3SG       go-3SG.SUBJ 
    b. Macedonian: 
       toj       sak-a          da      id-e 
       3SG       want-3SG       COMP    go-3SG 
    c. Romani (Balkans): 
       ov        mang-el-a      te      dza-l 
       3SG.M     want-3SG-IND   COMP    go-3SG.SUBJ 
    d. Greek: 
       (aftós)   thel-i         na      pa-i 
       3SG       want-3SG       COMP    go-3SG 
    ‘He wants to go.’ 


Macedonian Turkish replicates the Macedonian model construction, replacing the 
inherited Turkish infinitive and postposed modal verb (git-mek istiyor) by a finite, 
postposed complement clause. Note however that the Macedonian and the 
Macedonian Turkish constructions are not isomorphic: Macedonian makes use of 
a subjunctive complementizer to introduce the nonfactual complement clause, while 
the finite verb shows no distinction for mood. The Macedonian Turkish construction 
is based on the subjunctive (historical optative) inflection of the verb, with no com- 
plementizer. The “pivotal” feature that is replicated is thus the order of constituent 
clauses — a main matrix clause followed by a complement clause — and the sub- 
junctive marking of the complement clause. The means of achieving this marking 
are language specific, and draw on language-specific resources and constraints. 
The historical change involves an extension of the meaning and environment of 
the historical, semantically conditioned optative, to serve as a syntactically con- 
ditioned subjunctive. Romani (3c) generally replicates the Greek construction, itself 
a close parallel to the Macedonian one. It introduces a subjunctive complementizer 
by drawing on an inherited correlative particle te (cf. Hindi to), exploiting it in 
a function that matches that of the Greek subjunctive complementizer na. We thus 
witness the emergence of a linguistic area through a kind of chain development 
or series of independent language-internal grammaticalization processes, each trig- 
gered by contact among distinct pairs of languages. 

Some convergence developments involve much more subtle shifts in the 
mapping of form and meaning. Khuzistani Arabic undergoes a reanalysis of the 
morphology of its attributive construction — involving both nouns as attributes 
(genitive construction), and adjectives (see Matras & Shabibi 2007). In Arabic, 
adjectival attributes follow the head noun and agree with the head noun in 
gender, number, as well as in definiteness (4a). Nominal attributes, by contrast, 
are conjoined by means of the attributive Idafa-construction, whereby only the depen- 
dent (genitive) noun is overtly marked for definiteness (4b): 


(4) Standard Arabic (and other dialects): 
    a. l-walad   l-kabir 
       DEF-boy   DEF-big.m 
       ‘The big boy’ 
    b. walad     l-mudir 
       boy       DEF-director 
       ‘The director’s son’ 


In Persian, both types of attributes are treated in the same way: the attribute (whether 
adjectival or nominal) follows the head, and an attributive particle mediates between 
the two: 


(5) Persian: 
a. pesar-e bozorg 
boy-ATT big 
‘The big boy’ 
b. pesar-e modir 
boy-ATT director 
‘The director’s son’ 


The pattern in Khuzistani Arabic matches the Persian arrangement (note that, as 
in other dialects of Arabic, the definite article l- assimilates to dental consonants, 
resulting in gemination of that consonant): 


(6) Khuzistani Arabic: 
    a. walad   č-čibir 
       boy     DEF-big.m 
       ‘The big boy’ 
    b. walad   l-modir 
       boy     DEF-director 
       ‘The director’s son’ 


The key to understanding the change is the function and the position of the definite 
article in the nominal attribution in Arabic (4b), which resembles the function and 
the position of the attributive particle in nominal attributions in Persian. The 
Persian attributive particle is interpreted as the pivot of the Persian attributive 
construction, both nominal and adjectival. The Arabic definite article becomes 
associated with the Persian attributive particle due to the similarities in their 
structures in nominal attributions. It is then extended to match the Persian attributive 
particle in the adjectival attribution, resulting in a loss of the Arabic definiteness 
agreement in adjectival attributions and in a shift in meaning of the definite 
article itself. The conflation of the two constructions is, once again, an interesting 
challenge to the unidirectionality hypothesis in grammaticalization theory, which 
normally predicts that extension (of meaning or of distribution context) will lead 
to the emergence of new categories and to more differentiation. 

Table 3.1 Layout of the present-tense finite verb in languages of East Anatolia 

                    ‘I see’ present indicative       ‘I see’ present subjunctive 
                    ASPECT   ROOT        PERSON      ASPECT    ROOT        PERSON 
Turoyo Aramaic      ko-      -doz-       -eno/ono    Ø         -doz-       -eno/ono 
Kurmanji Kurdish    di-      -bin-       -im         Ø/bi-     -bin-       -im 
Persian             mi-      -bin-       -æm         (Ø)/be-   -bin-       -æm 
Western Armenian    ga-      -desn-      -em         Ø         -desn-      -em 
Levantine Arabic    b-       -a- -suf-   [zero]      Ø         -a- -suf-   [zero] 

Convergence can affect inflectional paradigms, too. In the case of the linguistic 
area of eastern Anatolia, contact has led to shared grammaticalization pathways 
in the development of aspect/mood prefixes. The languages involved — Persian, 
Kurdish, Armenian, Neo-Aramaic, and Levantine Arabic — all have a progressive- 
indicative aspectual prefix, usually derived from a preposition indicating location 
or similarity. The subjunctive is marked either by the absence of the progressive- 
indicative prefix, or by a specialized subjunctive prefix (Table 3.1). 

A further pathway for morphological convergence is through reanalysis and 
leveling of functions within the paradigm of the replica language. Fertek Greek 
from the Cappadocian region in central Anatolia adopts the Turkish agglutinative 
arrangement of case markers, drawing on its own inherited system of morphemes 
(Dawkins 1916: 113-14). While Greek has a declension-sensitive inflectional system, 
and a single morpheme may integrate several meanings/functions (e.g. GEN.SG.F.), 
Fertek Greek moves toward an agglutinating type in reducing the meanings of 
each morpheme: Thus -yu (originally M.SG.GEN) marks exclusively the genitive, 
independently of gender and number, like Turkish -in in the model. It can be 
combined with other markers of singularity or plurality into a layered case 
structure, as in the Turkish model (Table 3.2). 
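
The contrast between the fused Greek endings and the layered Fertek Greek and Turkish markers can be made concrete with a small sketch. It is only an illustration: the forms are those cited for 'wife' in Table 3.2, while the function names, the dictionary layout, and the label "base" for the non-genitive form are assumptions introduced here, and Turkish vowel harmony is ignored.

```python
# Illustrative sketch only. Forms come from Table 3.2; all names below
# (greek_form, layered_form, "base") are hypothetical labels, not the author's.

# Fusional pattern (Greek): one ending encodes number and case together.
GREEK_FUSED_ENDINGS = {
    ("sg", "base"): "a",
    ("pl", "base"): "es",
    ("sg", "gen"): "as",
    ("pl", "gen"): "on",
}

# Layered pattern: a number slot followed by an optional, invariant genitive slot.
LAYERED = {
    "fertek": {"stem": "nék", "number": {"sg": "a", "pl": "es"}, "gen": "yu"},
    "turkish": {"stem": "kadin", "number": {"sg": "", "pl": "lar"}, "gen": "in"},
}


def greek_form(number, case, stem="yinék"):
    """Look up the single fused ending for a number/case combination."""
    return f"{stem}-{GREEK_FUSED_ENDINGS[(number, case)]}"


def layered_form(language, number, case):
    """Concatenate stem, number marker, and (if genitive) the case marker."""
    spec = LAYERED[language]
    morphs = [spec["stem"], spec["number"][number]]
    if case == "gen":
        morphs.append(spec["gen"])
    # Skip empty slots (Turkish singular has no overt number marker).
    return "-".join(m for m in morphs if m)


for number in ("sg", "pl"):
    for case in ("base", "gen"):
        print(
            number,
            case,
            greek_form(number, case),
            layered_form("fertek", number, case),
            layered_form("turkish", number, case),
        )
```

In the layered pattern the genitive marker is the same string in every row, which is the sense in which -yu has been reduced to a single function.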

Table 3.2 Genitive case marking in Fertek Greek (based on Dawkins 1916) 

                  Greek       Fertek Greek   Turkish 
‘wife’            yinék-a     nék-a          kadin 
‘wives’           yinék-es    nék-es         kadin-lar 
‘of the wife’     yinék-as    nék-a-yu       kadin-in 
‘of the wives’    yinék-on    nék-es-yu      kadin-lar-in 

These few examples illustrate the potential of pattern replication to bring 
about major changes in the morphosyntactic typology of a language, from the 
level of representation of just a single construction such as attribution, to the 
principles of clause combining, tense-aspect representation, and the representation of 
nominal case. In a sense, the potential impact of convergence of this kind is even 
“deeper” or more far-reaching than that of matter replication or the borrowing 
of overt shapes of morphs and word forms: Items such as bound tense-aspect 
markers and case affixes are very rarely borrowed directly from one language to 
another, nor are definite articles frequent candidates for direct (matter) replica- 
tion. Of course, structures such as word order or the blueprint for clause com- 
bining are by their very nature only replicable as patterns. Arguably, pattern 
replication is more far-reaching in its potential because speakers are able to 
reconcile a radical and thorough shift in the way meanings are mapped onto forms 
and in the way word forms are organized at the phrase and sentence level, and 
morphemes are organized at the word level, with holding on to an inventory 
of word forms that are representative of their community language and so of 
their identity. As stated in the opening remarks, convergence offers speakers 
the opportunity to accommodate and generalize and yet still hold on to a men- 
tal demarcation between subsets of word forms within their repertoire. This, the 
compromise between form-structure continuity and organizational adaptation, is 
what makes almost each and every structure of language potentially vulnerable 
to convergence in situations of contact and multilingualism. 


4 Typology and Generalizations on 
Contact-Induced Change 


4.1 Types of generalizations 


In this final section I examine the typology of borrowing (matter replication), and 
attempt to generalize, predict, and assess the universality of borrowing. Much of 
the discussion on borrowing has chosen to focus on constraints (cf. Moravcsik 
1978) and in turn the opportunities to demonstrate counter-examples to proposed 
constraints and so render proposed generalizations invalid (cf. Campbell 1993). 
I am concerned here not with the postulation of absolute predictions concerning 
which structures can or cannot be borrowed; it is firstly an empirical fact that 
examples, however isolated, of borrowings have been cited for almost each and 
every type of grammatical morpheme. Moreover, the postulation of a negative 
(“no X can ever be borrowed”) carries with it obvious epistemological risks. 

Rather, I am concerned with an assessment of empirical data that show some 
clear tendencies in the area of borrowing, tendencies that merit our attention and 
justify a number of generalizations. These generalizations, in turn, demand an 
explanatory account, one that at the very least supplies a reasonable hypothesis 
as to why the borrowing behavior of certain functional categories should differ, 
more often than not, from that of others (all other conditions being equal). The 
empirical data on which the discussion will draw include recent sampling in 
contact linguistics, which has shed new light on insights which previously had 
been based largely on chance observation. 


4.2 Structural properties that favor borrowing 


Early interest in structural borrowing within linguistic typology focused on the 
hierarchical relationship among structures in respect of ease of borrowing. Under 
“ease” of borrowing we understand the likelihood of a structure type to be 
borrowed. This likelihood can be measured in two ways. First, by the frequency 
with which a structure is found to be borrowed in a sample of case studies of 
structural borrowing. Until recently, most measures of frequency had been based 
on casual observations rather than on strict sampling; nevertheless, there is often 
tacit agreement about which structures are more frequently found to be borrowed. 
The other measure is the duration and intensity of contact that is required, in 
relative terms, to license the borrowing of a particular structure (by comparison 
to others). Thomason and Kaufman’s (1988) frequently cited borrowing scale oper- 
ates on the basis of this kind of observation. It lists various structural categories 
in groups that are more and less likely to be borrowed, indicating both relative 
frequency of cases of borrowing, and the hypothesized relative time depth of con- 
tact needed for borrowing. On that particular scale, the rather vague category of 
“function words,” for example, figures higher (i.e. more “borrowable”) than 
“word order”: 


(7) Thomason and Kaufman's (1988) borrowing scale 
Casual contact Category 1: content words 
Category 2: function words, minor phonological features, 
lexical semantic features 
Category 3: adpositions, derivational suffixes, phonemes 
Category 4: word order, distinctive features in phonology, 
inflectional morphology 
Intense contact Category 5: significant typological disruption, phonetic 
changes 


Another way of measuring “ease” of borrowing is by examining the frequency 
of borrowed items by category in a particular corpus, based on a particular case 
study: 
(8) Haugen (1950: 224), on Norwegian and Swedish immigrant speech in the US: 
nouns > verbs > adjectives > adverbs, prepositions, interjections 


(9) Muysken (1981), on Spanish in Quechua (repeated by Winford 2003: 51): 
nouns > adjectives > verbs > prepositions > coordinating conjunctions > 
quantifiers > determiners > free pronouns > clitic pronouns > subordinating 
conjunctions 


The obvious difficulties with such an approach are keeping apart the particular 
conditions for borrowing, and the natural occurrence frequency of a category in 
a corpus, as well as possible structural constraints or even sociocultural factors 
that may be unique to the individual case study. 

An alternative approach to category borrowability is the attempt to identify 
structural factors that facilitate borrowing. Pioneered by Moravcsik (1978), the 
context of this approach is the study of universals of language, their structural 
manifestations, and the functionality that governs them. Moravcsik follows, to some 
extent, Weinreich’s (1953: 35) prediction that tight integration of a morpheme will 
limit its borrowability, but goes beyond that to identify semantic autonomy as 
a factor favoring borrowability. Lexical items are thus more borrowable than non- 
lexical items, nouns are more borrowable than non-nouns, free morphemes more 
than bound morphemes, and derivational morphology more than inflectional 
morphology. Both Johanson (2002) and Field (2002) revisit these tendencies and 
conclude that semantic transparency and a consistent form-meaning relationship 
facilitate borrowing. Field (2002) proposes the following hierarchy: 


(10) content item > function word > agglutinating affix > fusional affix 


Once again, the hierarchical arrangement represents both quantity (more content 
items are borrowed than function words, and so forth), and temporality (content 
items are borrowed earlier in the history of contact than function words, and so 
forth). The observed structural constraint is considered to some extent to be self- 
explanatory: Items that convey transparent meaning are more easily acquired. Their 
consistent meaning allows them to be replicated in different structural environ- 
ments and in different interaction contexts. And it is also reflected in their struc- 
tural autonomy, which facilitates their integration into another language. The 
question that is not asked in this connection is what motivates borrowing in 
the first place. Rather, it is taken for granted that the motivation for borrowing 
is extra-linguistic, in that speakers feel pressure to demonstrate competence in 
a prestige language, or else that it is internal to language, in the sense that 
speakers generalize certain vocabulary items across their repertoire for the sake 
of convenience, irrespective of the function of these vocabulary items, as long as 
there are no structural obstacles that stand in the way of their integration into the 
recipient language. Absence of transparency and absence of structural autonomy 
are considered potential obstacles that inhibit borrowing. 
4.3 Implicational hierarchies and motivations 
for borrowing 


While the hierarchies formulated by both Moravcsik (1978) and by Field (2002) 
are presented in the forms of implicational hierarchies, they do not explicitly set 
out the preconditions that are met at a higher position in the hierarchy and which 
enable borrowing at a lower position. In other words, the theme of the hierarchy 
is one that is regarded as facilitating, not as motivating borrowing. Agglutina- 
tive morphs are thus simply more likely to be borrowed than fusional morphs 
since they satisfy the conditions that facilitate borrowing more strongly and more 
consistently, but the borrowing of agglutinating morphs does not constitute a 
prerequisite for the borrowing of fusional morphs. Moreover, the facilitating 
conditions themselves do not, as I pointed out earlier, explain the motivation for 
borrowing. 

A different kind of approach to implicational hierarchies of borrowing isolates 
individual values of a single category — or paradigmatic values — for comparison 
in respect of their borrowing behavior. Already Stolz (1996) identifies a correlation 
between different function words that are borrowed from Spanish into Central 
American and Pacific languages: 


(11) Implications for the borrowing of Spanish function words in Central 
American and Oceanic languages (Stolz 1996): 
a. If a language has borrowed porque ‘because’, then it will always have 
borrowed pero ‘but’. 
b. If a language has borrowed more than two function words [from 
Spanish], then pero ‘but’ is among them. 
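
The two implications in (11) can be read as checks on the set of Spanish function words that a language has borrowed. The following is a minimal sketch under that reading; the function name and the sample sets are invented for illustration and do not report data from Stolz's study.

```python
# Illustrative sketch only: the function name and the example sets are
# hypothetical; only the two implications themselves come from (11).

def violated_implications(borrowed):
    """Return the labels of the implications in (11) that a set of
    borrowed Spanish function words fails to satisfy."""
    violations = []
    # (11a): porque 'because' is only borrowed if pero 'but' is borrowed too.
    if "porque" in borrowed and "pero" not in borrowed:
        violations.append("11a")
    # (11b): more than two borrowed function words implies pero is among them.
    if len(borrowed) > 2 and "pero" not in borrowed:
        violations.append("11b")
    return violations


# Invented inventories, used purely to exercise the checks.
print(violated_implications({"pero", "porque"}))    # consistent with both
print(violated_implications({"porque", "y", "o"}))  # violates both
```

A set that yields an empty list is consistent with both implications; it does not follow that such a set is attested.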


But the category of “function words” remains a vague one, based largely on 
structural criteria, and no common denominator can be isolated to account for 
the findings that are presented in Stolz’s study. When the data are re-examined, 
however, and attention is given to values of consistent semantic-pragmatic 
categories, a pattern emerges that is confirmed from other contact situations as 
well. In Matras (1998a) I examined a sample consisting of (a) various dialects 
of Romani in contact with different languages (such as French, Hungarian, 
Romanian, Turkish, and Greek); (b) a sample of languages under the historical 
sphere of influence of Arabic, either directly or mediated via other languages such 
as Turkish or Persian (including Hausa, Swahili, Kurdish, Neo-Aramaic, Turkish, 
Lezgian, Macedonian, Persian, Urdu, and more); and (c) Stolz and Stolz’s (1996) 
sample of some 40 Central American languages in contact with Spanish. For the 
set of coordinating conjunctions, all samples showed the following implicational 
borrowing hierarchy, i.e. the item on the right is only borrowed if the item on the 
left is also borrowed: 


(12) but > or > and 
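
Read as an implicational statement, the hierarchy in (12) says that the set of borrowed conjunctions in any one language should form an unbroken stretch starting from the left. A minimal sketch of that check follows; the function name is an assumption made here, and the example sets are invented rather than taken from the samples.

```python
# Illustrative sketch only: `conforms` is a hypothetical helper, and the
# example sets below are invented, not drawn from the samples discussed.

HIERARCHY = ["but", "or", "and"]  # left = borrowed first, as in (12)


def conforms(borrowed, hierarchy=HIERARCHY):
    """True if every borrowed item has all items to its left borrowed too."""
    gap_seen = False
    for item in hierarchy:
        if item in borrowed:
            if gap_seen:
                # An item further right is borrowed although one to its left is not.
                return False
        else:
            gap_seen = True
    return True


print(conforms({"but"}))        # True
print(conforms({"but", "or"}))  # True
print(conforms({"and"}))        # False
```

The check treats the hierarchy as exceptionless, which is stronger than a statement about tendencies; it is meant only to show what the implicational reading predicts.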
Since we are dealing here with values of the same word class and the same functional 
sub-category, namely coordinating conjunctions, it is possible to reduce the 
opposition between the values to a single semantic-pragmatic feature, namely 
the expression of “contrast.” On this basis we can postulate a link between the 
expression of contrast and the likelihood of borrowing; we can thus isolate the 
semantic-pragmatics of contrast as a factor motivating borrowing. Why should 
contrast act as a driving factor for borrowing? The explanation offered in Matras 
(1998a) for the high susceptibility of contrast to borrowing is the interaction- 
level tension surrounding the act of contradicting a shared presupposition. 
It is hypothesized that this tension puts a strain on the speaker’s processing of 
language, which may in turn interfere with the selection and inhibition mechan- 
ism (cf. Green 1998) that controls the retrieval of “language-correct” items from 
the bilingual repertoire. This may result in a malfunction of the selection and inhib- 
ition mechanism and the production of a functionally correct item — a contrastive 
conjunction — but in the “wrong language” (a form that is not context-appropriate). 
Evidence in support of this hypothesis has been presented in the form of a corpus 
of bilingual speech-production errors of this kind, which tend to target connectors 
and discourse markers in general, and expressions of contrast in particular (see 
Matras 2000; 2007a). 

Now, not all speech production errors (“interference” or “transfers”) have the 
potential to lead to language change. But we must treat them as in some ways 
similar to the spontaneous innovations by individual bilingual speakers that 
were discussed above. Although they are not “strategic,” but in a sense “counter- 
strategic” and quite often subject to immediate self-repair by bilingual speakers, 
under certain circumstances errors of this kind may remain unnoticed, unrepaired, 
and uncorrected by the speaker and interlocutor. This might occur in situations 
where there is full acceptance of bilingualism, lax normative pressure to conform 
with a particular image of “correct” language, and tolerance toward word forms 
from the particular donor language in question. Romani provides a good example, 
since it is always spoken in a community of bilinguals, it is never the dominant 
language of public life, and although language loyalty is important in many com- 
munities, it is often projected onto a rather modest set of basic vocabulary items 
and their inflection, which serve as adequate tokens of ethnic-linguistic separateness. 
Infiltration of connectors and other word forms, lexical and grammatical, from 
the surrounding languages is thus widely tolerated. Central American languages 
are similarly useful instances, since Spanish dominates as the language of pub- 
lic life, urban life, education, and economic activity, and its infiltration into the 
indigenous ethnic languages is taken for granted. 

In these kinds of situations there is fertile ground not just for the acceptability 
of speech production errors as appropriate, but also for their propagation. The 
generalization of one form at the expense of the corresponding form or structure 
of the recipient language constitutes a kind of compromise in the management 
of the bilingual repertoire. It allows the speaker, on the one hand, to simplify con- 
trol and management of the bilingual repertoire by reducing the effort needed to 
retrieve the correct structure around processing functions that are likely to place 
a strain on the selection mechanism. On the other hand, by limiting such “fusion” 
of subsets within the repertoire to just a limited number of semantic-pragmatic 
functions, the speaker is retaining the overall principle of context-appropriate selec- 
tion of structures — or “language differentiation” — within the repertoire (for a more 
elaborate discussion see Matras 2009, ch. 4). 

Thus, by isolating the semantic-pragmatic feature that characterizes the borrow- 
ability cline for paradigm values of a single functional category, we are able to 
extract insights into the factors that motivate borrowing — in this case, the expres- 
sion of contrast — and to offer a hypothesis about the genesis of the borrowing 
process. The very same principle can be applied to a series of category value sets, 
as carried out by Elšík and Matras (2006) for a sample of some 75 Romani dialects, 
and by Matras (2007b) for a cross-linguistic sample based on reports for some 30 
case studies. Both investigations confirm a correlation between borrowability 
and the semantics of elements that convey relative vulnerability of the speaker’s 
assertive authority, as in the case of contrast (which is likely to be received 
hesitantly by the listener). This underscores the hypothesis of a correlation 
between a higher strain on the speaker’s processing effort in the interaction (in 
pursuit of the listener’s attention and trust), and weaker control of the selection 
and inhibition mechanism that is responsible for context-appropriate choices 
of structures and items within the multilingual linguistic repertoire. The data 
discussed in these studies confirm, for example, the higher susceptibility to 
borrowing of modality (insecure knowledge), of obligation (external force), con- 
dition (non-real event), purpose and factuality (weak semantic integration of 
an event presented as one whole), cause (justification attempt), the superlative 
(a marked contradiction to a presupposed set), and indefinites (absence of clear 
identification of the referent): 


(13) Further borrowing hierarchies (Elšík & Matras 2006; Matras 2007b): 
a. modality > aspect/aktionsart > future tense > (other tenses) 
b. obligation > necessity > possibility > ability > desire 
c. concessive, conditional, causal, purpose > other subordinators 
d. factual complementizers > nonfactual complementizers 
e. superlative > comparative > (positive) 
f. indefinites > interrogatives > (other) deixis, anaphora 


Also high on the borrowing hierarchy are items that convey the speaker’s monitoring 
and directing of the interaction (discourse markers, fillers, tags, interjections, 
focus particles), control over which is subject more to automatized routine 
than to reflection and intent, and which on these grounds also easily escape the 
speaker’s control over the selection and inhibition mechanism. 

Finally, another type of motivation can be extracted from empirically attested 
borrowing hierarchies. Once again, the relevant studies are the corpus of Romani 
dialects discussed in Elšík and Matras (2006), and the cross-linguistic sample of 
case studies discussed in Matras (2007b). The relevant theme here is the elimination 
of the need to select among language alternatives (in other words, 
structural “fusion” within the repertoire for a particular category) around items 
that can be closely related to activities that are more likely to be performed in the 
donor language. In essence, we are talking about speakers generalizing a routine 
which they perform more regularly in one of their languages, for use across the 
linguistic repertoire, irrespective of interaction context and choice of language. 
By contrast, on the opposite end of the cline we find items that are relatively resilient 
and are “protected” from borrowing through their association with a routine that 
is normally performed in the recipient language: 


(14) Context-bound, routine-based hierarchies: 
a. unique referents > general/core vocabulary 
b. nouns > non-nouns 
c. numerals in formal contexts > numerals in informal contexts 
d. higher cardinal numerals > lower cardinal numerals 
e. days of week > times of day 
f. peripheral local relations > core local relations 
g. remote kin > close kin 


I use the term “unique referents” to capture the value of names of institutions, 
customs, and other activity routines that are associated with a specific socio- 
cultural setting. The association with a specific routine prompts the bilingual 
speaker to activate those associations whenever mention is made of the referent, 
and so to maintain its original form rather than attempt a translation or paraphrased 
rendering. A similar principle applies to names for culture- or environment-specific 
objects, artifacts, instruments, flora and fauna, and so on; this provides a function- 
oriented explanation for the greater borrowability of nouns over non-nouns. 
Numerals are borrowed more often in formal contexts — for example when citing 
dates or commercial quantities, or in connection with commercial transactions — 
where they are associated with the language of the institutional domain and 
commerce. Among cardinal numerals, higher figures tend to be borrowed before 
lower figures, the latter being protected by the routine of everyday counting in 
the recipient language, the former being more typical of institutional settings (school, 
trade, administration, and so on). Terms for days of the week are more readily 
associated with the language of the public domain than times of day, which are 
again part of everyday routines. Finally, complex local relations (such as “against,” 
“opposite,” “around”) and more remote kin (e.g. “uncle,” “grandparent”) are more 
prone to be substituted by public domain speech routines, while their more basic 
and proximate counterparts — local relations such as “on,” “in,” or close kin such 
as “daughter,” “father” — are protected by the language of everyday household 
routines. 




5 Conclusion 


Our discussion of both convergence (pattern replication) and borrowing (matter 
replication) has shown that filling “gaps” in the replica or recipient language is 
not a primary motivation for contact-induced language change, save perhaps in 
the adoption of lexical labels for new artifacts and concepts. It certainly does not 
explain the motivation for change in grammatical structure. Nor is the prestige 
of the model or donor language a motivation for change, but at most an extra- 
linguistic condition that enables change. Structural features of linguistic constructions 
and categories may facilitate borrowing, but the fact that clear trends can be 
identified in samples that cross the boundaries of various linguistic types indi- 
cates that the overall structural composition of both the donor and the recipient 
language are secondary factors in facilitating or impeding borrowing. The same 
can be claimed for different combinations of cultures; while some cultural features 
may support or prevent borrowing, general trends from cross-linguistic samples 
confirm that the impact of culture-specific conditions is secondary. 

The fact that borrowing hierarchies can be postulated for a range of categories 
and category domains, irrespective of the typology of the languages involved and 
of culture-specific factors, indicates that there are universals of borrowing, and 
justifies a universalist approach to borrowing. But what are the implications of a 
universalist approach, and what are the dimensions and parameters within 
which it can be pursued? It was suggested above that the universal factor that 
plays a crucial role in triggering contact-induced change is the language-processing 
mechanism, and the challenge facing the bilingual speaker to manage context-sensitive 
selection of structures and items within a complex repertoire of linguistic 
structures. Bilingual speakers pursue a limited range of strategies in order to facilitate 
management of the bilingual repertoire, among them the creation of new 
constructions (combining a model from one language with word forms from 
another), and the generalization of just a single set of forms for a particular 
category (or fusion) across the repertoire. The conditions under which the latter 
process takes place can be read from the pattern of borrowing hierarchies, which 
allows us to hypothesize a connection between likelihood of borrowing and (a) 
the effort that is required to successfully manage vulnerable points in the discourse 
interaction, and consequent loss of control over the selection and inhibition 
mechanism, as well as (b) the association of particular linguistic tasks with 
routines, and the generalization of those associations across the repertoire. 
4 Contact and 
Grammaticalization 


BERND HEINE AND TANIA KUTEVA 


1 Introduction: Contact-Induced 
Grammaticalization 


Language contact may have a wide range of implications for the languages 
involved, and it may affect virtually any component of language structure 
(Thomason & Kaufman 1988). It manifests itself in the transfer of linguistic 
material from one language to another, typically involving the following kinds 
of transfer: 


(1) Kinds of linguistic transfer: 
a. form, that is, sounds or combinations of sounds, 
b. meanings (including grammatical meanings) or combinations of meanings, 
c. form-meaning units or combinations of form-meaning units, 
d. syntactic relations, that is, the order of meaningful elements, 
e. any combination of (a) through (d). 


Our interest in this chapter is with (1b). Following Weinreich (1953: 30-1; see also 
Heine & Kuteva 2003; 2005; 2006), the terms model language and replica language 
are used for the languages being, respectively, the source (or donor) and the target 
(or recipient) of transfer, and his term replication stands for kinds of transfer that 
do not involve phonetic substance of any kind, that is, for (1b) and (1d) — for what 
traditionally is referred to with terms such as structural borrowing or (grammatical) 
calquing. Thus, by grammatical replication we mean a process whereby speakers 
create a new grammatical meaning or structure in language R on the model of 
language M by using the linguistic resources available in R. 

The term borrowing is reserved for transfers involving phonetic material, either 
on its own (1a) or combined with meaning (1c).¹

Furthermore, we will distinguish between replication restricted to the lexicon 
(= lexical replication) and replication that concerns grammatical meanings or 
structures (= grammatical replication). As has been shown in Heine and Kuteva 
(2003; 2005; 2006), grammatical replication is essentially in accordance with 
principles of grammaticalization; however, there are a few cases that are not, and
the term restructuring has been proposed for the latter. The model of contact-induced
transfer used here can summarily be represented as in Figure 4.1.

Contact-induced linguistic transfer
    Replication
        Grammatical replication
            Contact-induced grammaticalization
            Restructuring
                Rearrangement
                Loss
        Lexical replication
    Borrowing

Figure 4.1 Main types of contact-induced linguistic transfer

A theoretical position underlying the majority of works on grammaticalization 
is that grammaticalization is a language-internal process, and that grammaticaliza- 
tion contrasts with language contact, which is viewed as providing an alternative 
explanation for language change processes. Notice that the very introduction of 
the notion of grammaticalization was an attempt to complement the taxonomy 
of the three major types of language change identified by the Neogrammarians 
and de Saussure, namely sound change, analogy, and borrowing. Thus when Meillet 
(1912) first proposed the term grammaticalization (which — unlike analogy — 
creates new categories for which no previous patterns exist) he set a standard for 
treating grammaticalization as something other than contact-induced phenomena 
(such as borrowing). 

A perhaps extreme version of the view that language-internal change and 
language contact are mutually exclusive can be seen in the assertion that the
former is “natural,” whereas contact-induced change (Ross 2003),
or language-external change, is “non-natural” (for an explicit articulation of this 
standpoint, see Trudgill 1983: 102). In this spirit, Lass (1997: 199, 209) argues that 
whenever there is a possibility of dual or multiple origin for a given feature in a 
language, the most “parsimonious” explanation for a linguistic innovation is one 
in terms of internal development (“endogeny,” Filppula 2003) rather than of lan- 
guage contact, that is, of external development: Endogenous changes, he argues, 
occur in any case “whereas borrowing is never necessary.” 

In the model that we propose, language contact and grammaticalization are 
not mutually exclusive; rather, they may work in conspiracy with each other. 
Accordingly, both language-internal change and contact-induced change are nat- 
ural. Our model thus is an attempt to capture the inner dynamics of a complex 
process involving no fewer than the following factors: 


(2) a. universals of human conceptualization and grammaticalization, 
b. language contact, and 
c. the sociolinguistic and pragmatic setting. 
The goal of grammaticalization theory is to describe grammaticalization, that is,
the way grammatical forms arise and develop through space and time, and to 
explain why they are structured the way they are. Contact-induced grammat- 
icalization is a grammaticalization process that is due to the influence of one lan- 
guage on another; instances of it have been discussed in a wide range of studies 
(see especially Campbell 1987; Stolz 1991; Haase 1992; Heine 1994; Harris &
Campbell 1995; Nau 1995; Bisang 1996; Kuteva 1998; 2000; Dahl 2000; Heine &
Kuteva 2002; 2003; 2005; 2006; Stolz & Stolz 2001; Aikhenvald 2002), even if a
number of these studies do not use a framework of grammaticalization theory. 

There is a widespread assumption among linguists that grammatical structure, 
or syntax, cannot be “borrowed,” that is, transferred from one language to 
another. This assumption is reflected in a recent survey article by Sankoff (2002), 
who concludes that “whether or not ‘grammar’ or ‘syntax’ can be borrowed at 
all is still very much in question ... many students of language contact are con- 
vinced that grammatical or syntactic borrowing is impossible or close to it” 
(Sankoff 2002; see also Silva-Corvalán 2007). We consider this no longer to be an
issue, considering that there is by now abundant evidence to demonstrate that 
both grammar and syntax can be “borrowed” or, as we will say here, replicated 
(see e.g. Ramisch 1989; Ross 1996, 2001, 2003; Johanson 1992, 2002; Aikhenvald
2002; Heine & Kuteva 2003, 2005, 2006), and the present chapter will provide further
evidence in support of this observation.


2 Process versus Product 


Our main concern in this chapter is with contact-induced grammatical replica- 
tion as a product, for which there is some cross-linguistic evidence, and we will 
have little to say about the process leading to this product since it is still in the 
main poorly understood. The following remarks are meant to provide at least some 
general understanding of the nature of this process, which has both a socio- 
linguistic and a linguistic component. 

We assume that at the beginning of the process as a sociolinguistic phe- 
nomenon there typically is spontaneous replication in bilingual interaction, where 
an individual speaker — consciously or unconsciously — propagates novel features 
in the replica language that have been influenced by some other language (or 
dialect). Spontaneous replication, described with reference to notions such as
“speaker innovation” (Milroy & Milroy 1985: 15), is highly idiosyncratic and the
vast majority of instances of it will have no effect on the language concerned, being 
judged as what are commonly referred to as “speech errors.” But some instances 
may catch on: Being taken up by other speakers and used regularly, they may 
become part of the speech habits of a group of speakers (early adopters), and they 
may spread to other groups of speakers — in exceptional cases even to the entire 
speech community. Still, this process does not necessarily lead to linguistic 
change: Such innovations may remain restricted to some specific period of time, 
being abandoned either by the very speakers who introduced them or by the next
generation of speakers. It is only if an innovation acquires some stability across
time that grammatical replication has taken place.

Table 4.1 From minor to major use pattern in grammaticalization (Heine &
Kuteva 2005; 2006)

Stage  Frequency              Context                    Meaning
0      Low frequency          Restricted                 Weakly grammaticalized
I      Increase in frequency  Extension to new contexts  An additional, more grammatical meaning
                                                         may emerge in the new contexts
II     High frequency         Context generalization     Generalization of the new grammatical meaning

All the data that are available suggest that grammatical replication as a linguistic 
process is grounded in discourse pragmatics and semantics rather than in syntax. 
It may concern a single lexical or grammatical item (item extension), a cluster of 
several items (pattern extension), or a natural class of items (category extension). 
Contact-induced innovations tend to be confined to an increase in frequency of 
use and an extension of existing pieces of discourse structure to new contexts, 
some of which are associated with novel meanings, and they also tend to be confined 
to specific semantic and/or lexical features. Some form of consistency is attained 
when, as a result of language contact, clusters of discourse pieces turn into new 
use patterns, or when existing minor use patterns turn into major ones, as 
sketched in Table 4.1. It is only a minor fraction of new use patterns that will 
eventually develop further into new distinct grammatical structures of grammatical 
replication, such as functional categories or new forms of syntax. 

With these remarks we wish to draw the reader’s attention to the fact that the
discussion in this chapter covers only a highly limited range of the phenomena
that have to be taken into account when studying contact-induced grammaticalization
in particular and grammatical replication in general.


3 Ordinary versus Replica Grammaticalization 


A distinction between two kinds of processes of grammatical replication is pro- 
posed by Heine and Kuteva (2005, ch. 3). Speakers may create a new use pattern 
or category that is equivalent to a corresponding category in the model language 
either by drawing on universal principles of grammaticalization (= ordinary 
grammaticalization), or else by replicating the process that they observe in the 
model language (= replica grammaticalization). 

The following example illustrates the significance of ordinary grammaticaliza- 
tion. Eastern Oceanic languages of northern and central Vanuatu commonly 
distinguish a durative aspect indicating that an act is in progress. Apparently
in an attempt to find an equivalent for such a category in Bislama, the English- 
based pidgin of Vanuatu, speakers used an expression commonly recruited cross- 
linguistically to develop progressive and durative aspect markers (Heine & 
Kuteva 2002: 127, 198). They chose a use pattern involving their verb stap ‘stay,
be present, exist’² to develop a durative aspect marker, which appears in the same
syntactic slot as the durative markers in the Oceanic model languages³ (Keesing
1988; 1991: 328); cf. the following sentences, where (3a) illustrates the replica and
(3b) the model structure:⁴


(3) a. Bislama (English-based pidgin; Keesing 1991: 328)
       em   i     stap   pik-   im    yam.
       he   he-   DUR    dig-   TRS   yam
       ‘He’s in the process of digging yams.’

    b. Vetmbao (Malekula, Oceanic; Keesing 1991: 328)
       naji   ng-   u-     xoel   dram.
       he     he-   DUR-   dig    yam
       ‘He’s in the process of digging yams.’


That this is an instance of ordinary grammaticalization is suggested by the fact 
that the Oceanic durative markers did not provide any model for how to create a 
durative marker in Bislama. The pidgin speakers therefore drew on a universal 
conceptual strategy (see Bybee, Perkins, & Pagliuca 1994) by grammaticalizing a 
lexical verb to an aspect marker that is functionally and syntactically equivalent 
to that of the Oceanic model languages. 

The significance of the distinction between ordinary and replica grammatical- 
ization can be illustrated with example (4), which involves the latter. In the course 
of half a millennium of contact with Italian, speakers of Molisean (Molise Slavic, a 
language derived from Croatian and spoken on the eastern coast of central-southern 
Italy) created an indefinite article by grammaticalizing their numeral for ‘one’.⁵
In doing so, they appear to have replicated a situation that they observed in their
model language, Italian (Breu 2003a). That the articles in the two languages are
largely equivalent can be shown with the following example, where both languages
have an optional zero article (Ø) in certain generic uses:


(4) Molisean versus Italian (Breu 2003a: 42)
    Molise Slavic   Ona je na studentesa./Ø profesoresa.
    Italian         Lei è una studentessa./Ø professoressa.
    ‘She is a student/a professor.’


But Molisean speakers did not acquire a definite article, that is, they did not 
similarly replicate the Italian definite article. Breu attributes this to the fact 
that Italian provided a model for the indefinite but not for the definite article: 
Whereas the Italian numeral un- ‘one’ and the indefinite article have similar forms, 
and hence allowed Molisean speakers to establish a conceptual link between 
the two, the Italian demonstratives show no formal similarity with the definite
article. Accordingly, Molisean speakers acquired an indefinite article via replica 
grammaticalization, but creating a definite article would have required ordinary 
grammaticalization, which these speakers did not make use of, hence they did 
not develop a definite article. 


4 Grammaticalization versus Polysemy Copying 


Breu (2003b) reports the following case of grammatical replication in the contact 
situation between Italian, the dominant language, and Molisean, which has
been in contact with Italian for roughly 500 years: The Italian verb portare is 
polysemous, meaning both ‘carry’ and ‘drive a car’, while the pre-contact 
Slavic verb nosit only meant ‘carry’. Speakers of Molisean replicated the Italian 
polysemy by extending the meaning of nosit to include both ‘carry’ and ‘drive 
a car’. This process is called polysemy copying by Heine and Kuteva (2005, 
section 3.2). Polysemy copying (usually discussed under the rubric of calquing or 
loan translation) is fairly common in lexical replication, as in the present example, 
but appears to be rare in grammatical replication, where a more complex process 
tends to be involved, namely grammaticalization (see Heine 2007 for detailed 
discussion). 

The following example may illustrate the latter process. Slavic languages are 
renowned for their lack of articles, that is, of markers whose primary function it 
is to express definite or indefinite reference. The situation is, however, slightly 
more complex. There is a gradually decreasing degree of grammaticalization of 
definite and indefinite articles, respectively, as we move from Western European 
to Central European to Eastern European language varieties (i.e. languages,
dialects, and regional varieties). In other words, with reference to the situation 
among the Slavic languages, there is hardly any discernible grammaticalization 
in the easternmost languages (Standard Russian, Belarusian, Ukrainian), and the 
more one moves west and south, the higher the degree of grammaticalization of 
articles becomes. Given this geographical linguistic distribution, Heine and Kuteva 
(2006) argue that language contact was instrumental in the grammaticalization of 
articles in the majority of the cases discussed, especially for the following reasons. 
First, this is overwhelmingly the conclusion reached by experts who have dealt 
in some detail with the languages concerned (e.g. Breu 1994; 2003a; 2004; Lötzsch
1996). Second, there is sociolinguistic evidence to the effect that it is exactly those 
languages that are known to have had a long history of contact with article lan- 
guages such as German, Italian, or Greek that exhibit at least minor use patterns 
of articles. 

With reference to whether it is possible to reduce grammatical replication to a 
question of polysemy copying, this case is of interest for the following reason. 
From a grammaticalization perspective, it is the following stages that mark the 
gradual pragmatic and semantic evolution of many indefinite articles (Heine 1997:
70ff.):
1 An item serves as a nominal modifier denoting the numerical value ‘one’
(numeral). 

2 The item introduces a new participant presumed to be unknown to the hearer 
and this participant is then taken up as definite in subsequent discourse 
(presentative marker). 

3 The item presents a participant known to the speaker but presumed to be 
unknown to the hearer, irrespective of whether or not the participant is ex- 
pected to come up as a major discourse participant (specific indefinite marker). 

4 The item presents a participant whose referential identity neither the hearer 
nor the speaker knows (nonspecific indefinite marker). 

5 The item can be expected to occur in all contexts and on all types of nouns 
except for a few contexts involving, for instance, definiteness marking, proper 
nouns, predicative clauses, etc. (generalized indefinite article). 


Grammaticalization theory would predict that if a language has reached a given 
stage then it has also passed through all preceding stages, and this evolutionary 
scale can be used synchronically as an implicational scale. Both German and Italian 
have indefinite articles of stage 4, but neither has reached stage 5. Slavic languages on the
other hand are widely believed to lack articles. That this belief is in need of revi- 
sion has been demonstrated most of all by Breu (2003a) in his seminal analysis of 
Slavic “micro-languages.” We are not able to do justice to the fine-grained analysis 
presented by Breu (2003a); the reader is referred to this work for many more details. 

The following discussion is confined to one of the micro-languages spoken at 
the western periphery of the Slavic-speaking territory, namely Upper Sorbian,⁶
which has been in contact with the dominant language German for nearly one
millennium. In the examples below, sentences from Upper Sorbian (US) are
given, followed by German (G) and English translations; there are no interlinear
glosses in Breu’s publication (the markers in question are printed in bold, Ø stands
for lack of article).

Like German, Upper Sorbian has a fully grammaticalized stage 2, 3, and 4
indefinite article that can be traced back to the numeral jen- ‘one’.⁷ As the following
examples show, its use is obligatory (where Ø = no article and *Ø = the
article may not be omitted). Example (a) illustrates the presentative use
(stage 2), characteristic of openings in tales, (b) the specific indefinite stage 3, and 
(c) the nonspecific indefinite stage 4. With abstract and generic referents as well, 
Upper Sorbian shows roughly the same degree of grammaticalization as the German 
indefinite article does, cf. (d). 


(5) Upper Sorbian (Breu 2003a: 37 ff.) 


a. Stage 2   US   To běše jemo jena stara žona. *Ø
             G    Es war einmal eine alte Frau. *Ø
             ‘Once upon a time there was an old woman.’
b. Stage 3   US   Najmole jo jen tołsty muž nutř šišoł. *Ø
             G    Plötzlich kam ein dicker Mann herein. *Ø
             ‘Suddenly a fat man came in.’
c. Stage 4   US   Dy tybe jen polcaj stosi, ton tybe zasperwe. *Ø
             G    Wenn dich ein Polizist hört, wird er dich einsperren. *Ø
             ‘If a policeman hears you, he’ll arrest you.’


Breu (2003a) describes a parallel case from another Slavic micro-language, 
namely Molisean of Italy, which also can be assumed to have lacked article-like 
grammatical forms prior to language contact with the article language Italian. Like 
the Upper Sorbian numeral for ‘one’, the Molisean numeral has a paradigm
of morphophonological distinctions, one difference being that the Molisean
numeral additionally has long and short forms, e.g. je’na versus na (nominative
masculine singular). 

Like Upper Sorbian, Molisean shows roughly the same degree of grammaticalization
of the numeral, having developed a stage 4 indefinite article of the
same kind as the model language Italian. The reader is referred to Breu (2003a)
for examples; suffice it to illustrate the more advanced stages. In the examples 
below, sentences from Molisean (M) are given, followed by Italian (I) and English 
translations (once again there are no interlinear glosses; the markers in question 
are printed in bold, Ø stands for lack of article). The (a) example shows the use
with a stage 3 article with an abstract noun, while (b) shows a generic use of 
stage 4, where use versus non-use of the indefinite article appears to be lexically 
determined. Note that in both examples, the replica and the model languages agree 
to the extent that both can be used with and without article. 


(6) Molisean (Breu 2003a: 42)
    a. M   Jo, sa jima na strah!        or Ø
       I   Ahi, ho avuto una paura!     or Ø
       ‘Boy, was I scared!’
    b. M   Ona je na studentesa./Ø profesoresa.
       I   Lei è una studentessa./Ø professoressa.
       ‘She is a student/a professor.’


As these examples show, Molisean speakers, like Upper Sorbian speakers, have 
carried their numeral through all stages of grammaticalization, developing a stage 
4 indefinite article largely equivalent to the Italian model. 

But the situation is different in other Slavic languages. Table 4.2 shows in 
particular two things. First, it is exactly those languages which have had the most 
intense contact with languages having full-fledged (stage 4) indefinite articles that 
also have created corresponding articles. At one end there are Upper Sorbian and 
Molisean; at the other end there are Ukrainian and Belarusian, both languages
with the least amount of contact with article languages. Second, Table 4.2 also 
shows that the contact-induced grammaticalization proceeds in one direction 
from one stage to the next, where a new stage is built on the stage immediately 
preceding it. Synchronically, this fact can be described in the form of an impli- 
cational scale of the following kind: If a given article has stage X then it also has 
all preceding stages. 
Table 4.2 Degree of grammaticalization from numeral ‘one’ to indefinite article
in selected Slavic languages (sources: Breu 2003a; Heine & Kuteva 2006, ch. 3)

Stage  Function                Upper    Molise  Macedonian  Czech,     Serbian,   Ukrainian,
                               Sorbian  Slavic              Bulgarian  Croatian,  Belarusian
                                                                       Polish,
                                                                       Russian
1      Numeral ‘one’           +        +       +           +          +          +
2      Presentative            +        +       +           +          (+)
3      Specific indefinite     +        +       +           (+)
4      Nonspecific indefinite  +        +


Note that we are restricted here to nonstandard, colloquial, varieties of the languages concerned. 
As Breu (2003; 2005) has shown for Upper Sorbian, an entirely different picture would arise if 
Standard Upper Sorbian were chosen. 
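
The implicational reading of Table 4.2 can be made explicit in a short sketch. The Python fragment below is an illustration only and is not part of the analyses cited: the function names, the dictionary, and the decision to count marginal “(+)” values as attested are assumptions made here, and the stage sets are simply read off Table 4.2.

```python
def satisfies_implicational_scale(attested):
    """Return True if the attested stages form an unbroken run starting at stage 1.

    An empty set counts as trivially consistent (no stage is attested at all).
    """
    attested = set(attested)
    if not attested:
        return True
    highest = max(attested)
    # "If a given article has stage X then it also has all preceding stages."
    return attested == set(range(1, highest + 1))


# Stage sets read off Table 4.2 (assumption: "(+)" is counted as attested).
TABLE_4_2 = {
    "Upper Sorbian": {1, 2, 3, 4},
    "Molise Slavic": {1, 2, 3, 4},
    "Macedonian": {1, 2, 3},
    "Czech, Bulgarian": {1, 2, 3},
    "Serbian, Croatian, Polish, Russian": {1, 2},
    "Ukrainian, Belarusian": {1},
}


def violations(table):
    """List the language groups whose attested stages skip an earlier stage."""
    return [name for name, stages in table.items()
            if not satisfies_implicational_scale(stages)]


if __name__ == "__main__":
    print(violations(TABLE_4_2))
```

On the values given in Table 4.2 the list of violations is empty, which is what the implicational scale predicts; a hypothetical variety with stages 1 and 3 but not 2 would be flagged.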


This example confirms what has been observed in other cases where we 
have comparative data on grammatical replication, e.g. on the possessive perfect 
(Heine & Kuteva 2006, ch. 4) or on auxiliaries in European languages (Heine & 
Miyashita 2008). Rather than replicating a grammatical category in toto, speakers 
start out with the replication of the initial stages of grammaticalization, and it 
requires a situation of long and intense contact for the replica category to attain 
the same degree of grammaticalization as the corresponding category of the 
model language. This constraint on contact-induced grammatical replication 
suggests that, at least in cases such as the ones mentioned, there is really no 
polysemy copying; rather, what language contact triggers is a gradual process from 
less to increasingly more grammatical structure — a process that may end up in 
a fully equivalent replica category, as we observed in Upper Sorbian and Molisean. 
In most of the cases that have been reported on grammatical replication, how- 
ever, the process does not run through its full course; rather, the replica categories 
tend to be less grammaticalized than the corresponding model categories. 


5 Propelling versus Accelerating Forces in 
Language Contact 


We distinguish two kinds of forces moving grammaticalizing structures along 
grammaticalization paths: propelling forces and accelerating forces. Propelling forces, 
which are causally responsible for grammatical change to happen, include uni- 
versal mechanisms of change such as metaphor and metonymy, which are usu- 
ally realized through inference (Bybee et al. 1994: 281-93) and context-induced 
reinterpretation (Heine, Claudi, & Hünnemeyer 1991). It has been recognized by
students of grammaticalization that such universal mechanisms “drive” grammaticalizing
material along grammaticalization paths. What usually goes unrecognized
in the mainstream literature on grammaticalization, however, is that
language contact, rather than being the “non-natural” alternative to the “natural”
universal cognitive factors in language change, can also be a propelling force; it,
too, can drive a lexical or a grammatical morpheme along a grammaticalization
path. Which of the two forces, or to what extent each of them, is at work in a
given case of contact-induced language change is hard to determine on the basis
of the little information that exists on attested cases of language contact processes.

The following might be an instance of a propelling force. A grammaticaliza- 
tion process that can be said to be characteristic of “Standard Average European” 
languages (Haspelmath 2001) is the one leading from question words to markers 
of clause subordination, and from interrogative constructions to complement and 
adverbial clause constructions, cf. English Who came? versus I don’t know who came; 
for more details of this process, see Heine and Kuteva (2006, ch. 6; 2007b, ch. 5). 
This process appears to be largely a European and an Indo-European phe- 
nomenon; if attested elsewhere, this is likely to be in languages that have had 
some history of contact with European languages. Documented examples of a 
process whereby a non-Indo-European language grammaticalized one or more of 
its question words to clause subordinators on the model of a European language 
include the following (see Heine & Kuteva 2006, ch. 6): 


(7) a. Basque on the model of Spanish, Gascon, and French (Hurch 1989: 21; 
Trask 1998: 320), 
b. Balkan Turkish on the model of Balkanic languages (Matras 1998),
c. the North Arawak language Tariana of northwestern Brazil on the model 
of Portuguese (Aikhenvald 2002: 183), 
d. the Aztecan language Pipil of El Salvador on the model of Spanish 
(Campbell 1987: 259-60). 


From what we know about the nature of the replica languages concerned it is 
unlikely that a process from interrogation to subordination marker would have 
happened without language contact; hence, there is reason to maintain that in these 
cases, language contact was a sine qua non for grammatical change to take place 
— in other words, this appears to be an example where language contact must 
have acted as a propelling force, as a trigger for the grammaticalization. 

It is more difficult to identify clear cases where language contact acted as an 
accelerating force in grammatical change, that is, where it was causally respon- 
sible for speeding up a change that would have happened anyway: To be sure, 
the literature on language contact offers a number of examples that are intuitively 
suggestive of acceleration, but there are hardly any documented cases that would 
allow us to establish beyond reasonable doubt that language contact was the only, 
or at least the decisive, factor in contact-induced grammaticalization, or any 
other kind of grammatical replication. Problems that one is confronted with here 
concern, on the one hand, the question of how degrees of speed in grammatical 
change can be measured; on the other hand, they also relate to the fact that
contact-induced grammatical change is a complex process involving a range 
of different factors, and it is in most cases unclear which of the factors exactly 
contributes what to a given process. 

Still, there are examples that can be taken to illustrate the effects of acceleration.
One characteristic of many Indo-European languages, not commonly found
in other language families, is the use of suffixal inflections (degree
markers) on adjectives to express degrees of comparison, that is, to form comparative
and superlative categories (cf. English small, small-er, small-est), or, alternatively,
the use of suppletive forms (cf. good, better, best); both tend to be referred
to as synthetic forms or constructions. These forms have turned out to be vulnerable
to loss and replacement, and in a number of Indo-European languages,
especially the Romance languages, they have been replaced with free forms, that 
is, with analytic constructions (cf. English more, most beautiful). This process can 
be described as one of loss of synthetic constructions, but it is also one of gram- 
maticalization of an analytic construction involving a degree adverb to a com- 
parative construction which replaces the synthetic construction. Replacement has 
affected many European languages and in most cases there is no clear evidence 
that language contact played any role. But there are also cases where contact 
apparently was an accelerating factor. Such cases include in particular Germanic 
languages in contact with Romance languages, which have lost their synthetic 
comparatives except for a few relics: 


(8) a. Analytic forms were still rare in Northern English in the fifteenth cen- 
tury, while in Southern English, which was more strongly exposed to 
French influence, these forms had become current at least a century 
earlier (Danchev 1989: 169). 

b. In German, the synthetic comparative construction is still well established, 
e.g. schön ‘beautiful’, schön-er (‘beautiful-COMP’) ‘more beautiful’. But in
the dialect of Luxembourg, which has a history of centuries of intense 
contact with French, instances of an analytic comparative can be found, 
e.g. mehr schin (‘more beautiful’), presumably on the model of French 
plus beau (‘more beautiful’) (Alanne 1972; Danchev 1989: 170). 

c. The Dutch varieties spoken in Flanders, including Algemeen Nederlands, 
have been massively influenced by French, and this influence has also 
had the effect that the analytic pattern with the particle meer ‘more’ is 
used increasingly on the model of French plus, at the expense of the inher- 
ited synthetic construction using the comparative suffix -er (Taeldeman 
1978: 58). 

d. Molisean has a 500-year history of intense contact with the host language
Italian, and one of the results of contact was that the speakers of
Molisean have given up the conventional Slavic synthetic construction
by replicating the analytic Italian construction, using veče on the model
of Italian più ‘more’ as degree marker. Thus, while Standard Croatian
uses the synthetic form ljepši ‘more beautiful’, Molisean has veče lip (‘more
beautiful’) instead (Breu 1996: 26; see also Breu 1999).
More examples of this kind can be found in Heine and Kuteva (2006, ch. 2).
What they seem to suggest is that the grammaticalization of analytic compara- 
tive constructions at the expense of inherited synthetic constructions, while an 
ongoing process in a number of European languages, can be accelerated by 
language contact. 


6 Grammaticalization Areas 


In the course of language contact, a process of grammatical replication may affect 
three or even more languages, thereby giving rise to a grammaticalization area. 
A grammaticalization area exists when there is a group of geographically con- 
tiguous languages that have undergone the same grammaticalization process as 
a result of language contact. 

The following example may illustrate this. Passive markers in
European languages commonly use “be” or “become” as auxiliaries, with the main 
verb being encoded as a perfect participle form or some equivalent of it. But, as 
Ramat (1998) shows, Rhaeto-Romance, Italian, and the Bavarian dialect of German 
share a periphrastic passive construction based on the grammaticalization of the 
lexical verb “come” to a passive auxiliary, cf. (9): 


(9) The Alpine “come”-passive (Ramat 1998: 227-8) 
a. Ladin (Rhaeto-Romance) 
Cò vain fabricheda la scuola nouva.
[here comes built the school new]⁸
‘Here the new school is being constructed.’ 
b. Italian 
Qui viene costruita la scuola nuova. 
[here comes built the school new] 
‘Here the new school is being constructed.’ 
c. Bavarian (German) 
Da  kummt de nei(e) Schul gebaut. 
[here comes the new school built] 
‘Here the new school is being constructed.’ 


That this is an example of a grammaticalization area is suggested by the follow- 
ing observations: The constructions in the three languages concerned are the result 
of the same general process of grammaticalization from a construction [“come” 
+ perfect participle] to a passive construction, where a lexical verb for “come” 
gave rise to a passive auxiliary. And this process must have involved language 
contact, for the following reasons. First, “come”-passives do not appear to be 
cross-linguistically common⁹ and it is therefore statistically unlikely that these three
neighboring languages should have undergone such a rare process independently
of one another. Second, common genetic inheritance can be ruled out as a con- 
tributing factor: The “come”-passive cannot be traced back to earlier stages of 
Romance or Germanic languages. And third, the languages concerned can be shown 
to share a history of contact; Ramat (1998: 227-8) notes that there have been centuries
of contact between southern Germany and northern Italy and that it was
the Romance type that “has influenced the geographically contiguous Bavarian 
passive.” Accordingly, areal diffusion of a grammaticalization process among these 
geographically adjacent communities offers the most reasonable explanation. 

A number of such areas are discussed in Heine and Kuteva (2005, section 5.2). 
As pointed out there, grammaticalization areas may arise in two different ways: 
Either there is replication leading from the model language (M) to the replica language
(R₁), which again serves as a model for another replica language (R₂), as
sketched in (10a) below, or the process leads straight from the model language
to two different replica languages (R₁, R₂), cf. (10b). The synchronic outcome is
in both cases the same, namely a grammaticalization area consisting of the three
languages M, R₁, and R₂.
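
The two pathways can also be pictured as small sets of model-to-replica links. The following Python sketch is offered only as an illustration and is not part of Heine and Kuteva's account; the labels M, R1, and R2 follow (10), while the names `chain`, `star`, and `area_members` are hypothetical. It shows that two different transfer histories can yield the same synchronic set of languages.

```python
# Illustrative sketch only: each replication event is a (model, replica) pair.
# "M", "R1", "R2" follow the labels in (10); all other names are hypothetical.

def area_members(transfers):
    """Return the set of languages involved in a list of (model, replica) transfers."""
    members = set()
    for model, replica in transfers:
        members.add(model)
        members.add(replica)
    return members

# (10a): M serves as model for R1, which in turn serves as model for R2.
chain = [("M", "R1"), ("R1", "R2")]

# (10b): M serves directly as model for both R1 and R2.
star = [("M", "R1"), ("M", "R2")]

# The histories differ, but the resulting area is the same.
print(chain != star)
print(area_members(chain) == area_members(star))
```

On this simplified view, the synchronic membership of an area does not by itself reveal which of the two histories produced it.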


(10) Patterns of transfer in grammaticalization areas
a. M > R₁ > R₂
b. M > R₁
     > R₂


7 Change in Typological Profile 


Contact-induced grammaticalization can vary considerably in the way it contributes 
to language change. On the one hand, it may affect only a limited spectrum of 
grammatical structures of the replica language; on the other hand, its effects can 
be more dramatic in that it can be instrumental in changing the typological profile 
of a language.¹⁰

Examples of such contact-induced changes in typological profile are discussed 
in Heine and Kuteva (2005, ch. 5); the following example may illustrate the kind 
of changes that are involved in this process. Over three hundred years of lan- 
guage contact in Sri Lanka between the Dravidian language Tamil on the one 
hand and Malay and Portuguese on the other had the effect that the latter two, 
nowadays referred to as Modern Sri Lanka Malay and Modern Sri Lanka 
Portuguese, respectively, developed into creoles with basic SVO word order. More 
recently, the two have replicated grammatical structures of Tamil to the extent 
that they have turned into verb-final languages, developed roughly the same case 
system, evidentials, quotative markers, semantics of verb markers, etc. — just like 
the model language Tamil (Bakker 2000). Grammaticalization played an import- 
ant role in the Tamilization process. 

The following example shows how the change in typological profile from a 
creole to a Tamil-like structure was achieved. Neither of the two creoles had a 
grammaticalized case system. In an attempt to replicate the case system of Tamil, 
speakers of the erstwhile creoles used a universal process of grammaticalization 
in developing adpositional, adverbial, and other independent forms into case 
suffixes, using the lexical resources that were available in the creoles. Thus, 
Portuguese pera/para ‘for’ was grammaticalized to an accusative/dative case suffix
-pa, Portuguese junto ‘together’ to a locative case suffix -(u)nto, and Portuguese sua,
seu ‘his, her, its’ to a genitive suffix -su(wa) in Modern Sri Lanka Portuguese, thereby
matching the corresponding Tamil case markers (Bakker 2000: 32).


8 Constraints on Contact-Induced Grammatical Change


In a number of works on language contact, most recently in Thomason (2007), it 
has been argued that there are no absolute linguistic constraints on contact-induced 
change. In fact, there is now a wealth of data on language change processes in 
situations of language contact which all point in the same direction, namely that 
linguistic change in contact situations is fairly unconstrained. Work on grammatical 
replication, some of which is discussed in this chapter, shows, however, that this
claim may be in need of revision. The observations made in previous work by 
the present authors (Heine & Kuteva 2003; 2005; 2006) suggest that grammatical 
change in language contact is shaped by universal principles of grammaticaliza- 
tion (see especially Heine et al. 1991; Bybee et al. 1994; Hopper & Traugott 2003), 
and that these principles impose a number of constraints on what is, and what 
is not, a possible contact-induced grammatical change. 

Constraints are chiefly of the following kinds. First, they concern the choices 
that speakers make when looking in the replica language for translational equiva- 
lents of use patterns and categories that they find in the model language. For 
example, in a number of languages, a new future tense category has been created 
on the model of some other language (Heine & Kuteva 2005, section 3.3). In 
creating this category, speakers have drawn typically on a universally available 
strategy by grammaticalizing e.g. a verb for ‘go to’ or ‘want’, but not a verb for, 
say, ‘hit’ or ‘cut’. And in order to replicate an indefinite article of the model lan- 
guage, they most likely will select their numeral for ‘one’, and a demonstrative 
attribute to replicate a definite article, rather than grammaticalizing ‘one’ to a definite 
and a demonstrative to an indefinite article. 

Second, constraints relate to directionality in contact-induced grammatical 
change. To use the examples just mentioned, it has happened time and again that 
a language has developed a future tense¹¹ via the grammaticalization of ‘go’ or
‘want’, etc., but it is highly unlikely that language contact will have the effect that 
a future tense marker develops into a verb for ‘go’ or ‘want’, an indefinite article 
into a numeral for ‘one’, or a definite article into a demonstrative — in other words, 
grammatical change in language contact situations is essentially unidirectional. 

And third, the development of contact-induced grammaticalization proceeds 
along a step-by-step sequence, as we showed in our example of indefinite articles 
in Slavic languages. Accordingly, it is unlikely that speakers will replicate a highly
grammaticalized use pattern of a model language unless they have earlier replicated
less grammaticalized use patterns.

To be sure, these are constraints that are not restricted to language contact; rather,
they concern grammatical change in general. Grammaticalization is a ubiquitous 
process and is essentially unidirectional, irrespective of whether it takes place 
language-internally or in situations of language contact. 
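
The first two constraints can be restated informally as a lookup over directed source-to-target pathways. The Python sketch below is only an illustration under simplifying assumptions: it lists just the pathways named in this section, and the data layout and the function name `is_expected_change` are hypothetical rather than part of any published model.

```python
# Illustrative sketch only: a few source -> target pathways named in the text,
# stored as directed pairs. The function name and data layout are hypothetical.

PATHWAYS = {
    ("go", "future tense"),
    ("want", "future tense"),
    ("numeral 'one'", "indefinite article"),
    ("demonstrative", "definite article"),
}

def is_expected_change(source, target):
    """True if source -> target is among the listed pathways (not its reverse)."""
    return (source, target) in PATHWAYS

print(is_expected_change("go", "future tense"))                  # True
print(is_expected_change("future tense", "go"))                  # False: reversed direction
print(is_expected_change("numeral 'one'", "definite article"))   # False: wrong pairing
```

Because the pairs are ordered, the reverse of a listed pathway is not itself listed, which mirrors the claim that such changes are essentially unidirectional.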


9 Sociolinguistic versus Linguistic Factors 


Language contact is based typically on the social interaction among speakers (or 
writers) using more than one language (or dialect); accordingly, sociolinguistic 
parameters provide an important basis for describing and understanding contact- 
induced linguistic change, as has been aptly pointed out by many writers since 
Thomason and Kaufman (1988). One may wonder, however, whether much is 
gained if the study of language contact is reduced to sociolinguistic methodology. 
Studies on language contact differ greatly on whether they use a sociolinguistic 
or a linguistic framework, or a combination of both. All these frameworks have 
provided valuable insights, and there is no intrinsic reason to assume that any of 
them is superior to any other; which approach is most appropriate depends on 
which aspect of language contact one decides to study and on the kinds of prob- 
lems one wants to solve. 

Grammatical replication is a linguistic process and, accordingly, has been 
approached primarily by means of linguistic methodology (Heine & Kuteva 
2005; 2006). While being ultimately the result of social processes, an analysis of 
it is less dependent on sociolinguistic variables than many other manifestations 
of language contact. 

This can be shown with the following example. A paradigm sociolinguistic 
situation of language contact is one where one of the languages involved is prag- 
matically dominant (Matras 1998: 285) or functions as a dominant code (Johanson 
1992; 2002) and serves primarily as L2 for L1 speakers of the dominated code. As 
has been pointed out by the authors concerned, this distinction is of help for under- 
standing some aspects of both bilingual behavior and contact-induced language 
change; cf. e.g. Johanson’s (1992; 2000: 165-6; 2002: 3) distinction between adop- 
tion and imposition. With reference to grammatical replication, however, this dis- 
tinction is of secondary importance: Replication processes, as we described them 
in this chapter, may proceed in much the same way from L2 to L1 and from L1 
to L2, or from dominant to dominated and from dominated to dominant code. 
In the former case, speakers use their L2 as a model for replication in their L1, 
while in the latter case they replicate features of their L1 in their (use of) L2. 

There are in fact a number of cases showing that grammatical replication may 
proceed in both directions (see Heine & Kuteva 2005, section 6.3; 2006, section 
7.2). One case concerns Austronesian languages such as Tigak that appear to have 
provided a model for replication in the lingua franca Tok Pisin, but Tok Pisin 
also served as a model for Tigak (Jenkins 2002). Another example concerns the 
Basque language, which has replicated a wide range of grammatical structures 
from the dominant Romance languages Spanish and French, as well as from Gascon 
(see Hurch 1989; Haase 1992; 1997); at the same time, however, Basque also acted 
as a model language for grammatical replication by Spanish speakers in the Basque
country (Urrutia Cardenas 1995). Further examples are not hard to come by: one 
may mention language contact on the Channel island Guernsey, where English 
both served as a model language for the local Norman variety Guernésiais, and 
as a replica language, developing a number of new grammatical structures on 
the model of Guernésiais (Ramisch 1989; Jones 2002); situations of language con- 
tact between Turkic and Iranian languages, where both served at the same time 
as model and as replica languages (Soper 1987); or the contact setting in north- 
western Amazonia, where the North Arawak language Tariana acted as a replica 
language for a wide range of grammatical structures from the East Tucanoan 
language Tucano, but also provided the model for replication in the Tariana L2 
variety of Portuguese (Aikhenvald 2002). 

The study of a larger range of contact situations suggests indeed that L2 > L1 
replication is about as common as L1 > L2 replication (see Heine & Kuteva 2005 
for examples), and that the processes are roughly of the same kind. It would there- 
fore seem that in this kind of linguistic change, sociolinguistic variables only play 
a minor role and can largely be ignored in the study of grammatical replication 
— even if ultimately contact-induced change is triggered by sociolinguistic factors. 


10 Conclusions 


All the information that is available on language contact suggests that contact- 
induced grammatical replication in general and grammaticalization in particular 
are far more common than has previously been assumed. This is partly due to 
the fact that the latter is a ubiquitous, universally attested process that occurs in 
the same way language-internally and language-externally, and that in many cases 
it turns out to be hard, if not impossible, to determine whether a given gram- 
maticalization process was contact-induced or not (for a set of diagnostics, see 
Heine & Kuteva 2007a). It is also due to the fact that, compared to other forms 
of contact-induced change, grammatical replication has so far received little 
scholarly attention. Especially the earlier stages of the process, where new use 
patterns evolve in language contact, are still largely a terra incognita. 


NOTES 


T. Kuteva thanks the Humboldt Foundation and SOAS, University of London, for their 
support. 

1 There are many alternative terminologies; for example, Thomason and Kaufman 
(1988) or Thomason (2001: 93) use borrowing for both kinds of transfer. 

2 Bislama stap is historically derived from English stop. 

3 Our interpretation of this case rests on Keesing (1988; 1991). Note, however, that this 
grammaticalization is not confined to Bislama, it can also be observed in other 
English-based pidgins such as Tok Pisin and Solomons Pijin, and it remains unclear 
whether the process looked at here took place independently of these other cases. 

4 The following abbreviations are used in interlinear glosses: DUR = durative; TRS = 
transitivizer; COMP = comparative. 
5 The numeral has a range of different morphophonological variants depending on case,
gender, and number; the nominative masculine singular form, for example, is je’na or,
in its short form, na (Breu 2003a: 34).

6 All data analyzed by Breu (2003a) are from a rural, spoken variety of Upper Sorbian,
which differs considerably from Standard Upper Sorbian, especially with reference to
the phenomena looked at here.

7 The US form for jen- ‘one’ is made up of a complex morphophonological paradigm
on the basis of distinctions of six cases, two numbers (singular, plural), and three
genders (masculine, feminine, and neuter) (see Breu 2003a: 36).

8 Ramat (1998) does not provide interlinear glosses for these examples; the use of parentheses
indicates that the glosses are ours.

9 The only other case we are aware of is found in Maltese.

10 A “change in typological profile” obtains when a language as a result of grammatical
replication experiences a number of structural changes to the effect that that
language is structurally clearly different from what it was prior to language contact
(Heine & Kuteva 2006, section 7.2).

11 Note that we are using here and elsewhere a shorthand convention. If we say that “a
language has developed,” then it goes without saying that languages cannot do such
things, rather it is the speakers of this language that do them.


5 Language Contact and
Grammatical Theory 


KAREN P. CORRIGAN 


1 Introduction: The Research Context 


Language contact and pidgin/creole studies have become increasingly prominent 
since the ground-breaking conferences held in Mona, Jamaica in the late 1950s 
and 1960s. Nevertheless, they have, until recently, largely remained outside 
mainstream linguistic theory because much research in this field is underpinned 
by the view that these are “special”¹ languages engendered in “catastrophic”
circumstances providing the child with “heterogeneous” and exceptionally 
“impoverished” triggering experiences (Thomason & Kaufman 1988). In the 
spirit of DeGraff (2005), I would like to present evidence in this chapter which 
tests the hypothesis that, irrespective of their social histories, all natural languages 
— including those that arise in contact situations — should be interpretable within 
conventional grammatical models.² Naturally, given the limitations of space,
coupled with the fact that the primary objective is to provide an overview of how 
language contact phenomena at the morphosyntactic level can be accommodated 
within grammatical theory, my focus will be restricted to a single framework, 
namely, that advocated within the generative research program (see Chomsky 1981a; 
1981b; 1995; 2000; and 2004).³ It is hoped that this approach to investigating
language contact varieties will have the effect of demarginalizing them, and
opening them up as fields of research for the testing of current linguistic 
hypotheses on the interrelationship between language universals and language 
variation as advocated in Cornips and Corrigan (2005a). 


2 Contact-Induced Morphosyntactic Change and 
the Generative Model 


This chapter will, therefore, focus on reviewing linguistic analyses within a generative
framework that account for contact-induced morphosyntactic change⁴
similar to that illustrated in (1-3) below from research by Aikhenvald (2002) on 
Tariana, a North Arawak language spoken in northwestern Brazil. Younger
Tariana speakers, in particular, are widely exposed to Portuguese, which acts as 
the official language of the state. As these comparative examples demonstrate,
this distinctive contact ecology, in the sense of Mufwene (2001; 2008), has encouraged
them to create a new system of relative clause marking that is quite different from
the traditional norm (1): it permits the use of the Tariana interrogative pronoun
kwana ‘who’ in relatives (2), thereby mimicking the dual function of forms
like quem ‘who’ as interrogative pronouns/relative markers in Portuguese (3).


(1) Traditional Tariana (North Arawak: Aikhenvald 2002: 183) 
Ø ka-yeka-kanihi kayu-na na-sape
REL-know-DEM:ANIM thus-REM.P.VIS 3.PL-speak 
‘Those who knew used to talk like this’ 


(2) Younger Tariana speakers (North Arawak: Aikhenvald 2002: 183) 
kwana ka-yeka-kanihi kayu-na na-sape 
who  REL-know-DEM:ANIM thus-REM.P.VIS 3.PL-speak 
‘Those who knew used to talk like this’ 


(3) Portuguese (Aikhenvald 2002: 183) 
quem sabia falava assim 
who knew spoke like.this 
‘Those who knew spoke like this’ 


Although research on language contact from the early twentieth century onward 
readily accepted the view that such ecologies induced lexical and phonological 
borrowing and imposition of the kind described in previous chapters, there 
was a pervasive belief that it produced the kind of grammatical restructuring 
illustrated above much more rarely. While this perspective has largely been 
debunked by the seminal works of Weinreich (1953), and later Thomason and 
Kaufman (1988), traces of this perception still occasionally surface, as in the recent
statement by Winford (2003a: 97): 


syntactic structure very rarely, if ever, gets borrowed. In stable, bilingual situations, 
there are very strong constraints against such a change, even in languages subjected 
to intense pressure from a dominant external source. 


Although I disagree with Winford’s view on the extreme rarity of the phe- 
nomenon, he is absolutely right to invoke constraints on the kinds of morpho- 
syntactic features that can be borrowed and the circumstances in which this 
can take place. Indeed, one of the important tenets of all generative approaches
to borrowing/imposition of this nature is to safeguard against what Matras (1998:
282) has aptly termed the “anything-goes hypothesis” with respect to grammatical
borrowing/imposition in contact settings (see also Sankoff 2002). This orientation
springs naturally from the underlying principles of any comparative account within 
the generative tradition, which Haegeman (1997: 1) summarizes as addressing the
following questions: “(i) what is knowledge of language? (ii) how is this knowledge
acquired?” Chomsky (2000: 8), in attempting to answer these questions, describes
the standard model of acquisition within this framework as follows:


We can think of the initial state of the faculty of language as a fixed network con- 
nected to a switch box; the network is constituted of the principles of language, while 
the switches are the options to be determined by experience. When the switches are 
set one way, we have Swahili; when they are set another way, we have Japanese. 
Each possible human language is identified as a particular setting of the switches — 
a setting of parameters, in technical terminology. 


An individual’s internal (I) language is a set of values for these “parameters” 
acquired by the interaction of Universal Grammar (UG) and the “visible” data 
(external (E) language tokens) that constitute the triggering experience and 
which are characterized by underdetermination and the “no negative evidence” 
dictum (see Atkinson 1992; Lightfoot 1999; 2006; Guasti 2002; and Niyogi 2006). 
Thus, Ln could be held to differ parametrically from any subset thereof in the 
value of “null subjects,” directionality, agreement systems, or whatever. Moreover, 
there may be changes in the parameter settings of different stages of L, as Roberts
(2007), Lightfoot (1999; 2006), and others have argued for the histories of English
and French. There may even be parametric differences between dialects of the 
same L when viewed in apparent time or across geographical/social space (see 
Henry 1995; Cornips 1998; Kroch 2003; and various contributions to Cornips & 
Corrigan 2005b and Trousdale & Adger 2007). 
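
The idea of an I-language as a set of parameter values can be made concrete with a deliberately simplified Python sketch. It is an illustration only and not an implementation of any analysis cited here: the two varieties, the parameter names, and the values assigned to them are hypothetical, as is the function name `parametric_differences`.

```python
# Illustrative sketch only: an I-language modeled as a mapping from parameter
# names to values. The varieties, parameter names, and values are hypothetical.

def parametric_differences(grammar_a, grammar_b):
    """Return the shared parameters on which two grammars are set differently."""
    shared = grammar_a.keys() & grammar_b.keys()
    return {
        name: (grammar_a[name], grammar_b[name])
        for name in sorted(shared)
        if grammar_a[name] != grammar_b[name]
    }

variety_a = {"null_subjects": True, "head_directionality": "head-initial"}
variety_b = {"null_subjects": False, "head_directionality": "head-initial"}

print(parametric_differences(variety_a, variety_b))
# {'null_subjects': (True, False)}
```

The same comparison could in principle be applied to two historical stages of one language or to two dialects, which is the sense in which variation across time and space is treated as parametric.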

A major advantage of applying this type of explanation to the genesis of 
language contact varieties (including those that arose as a result of creolization 
processes) is that notions of uniqueness often invoked in studies describing the 
change and borrowing/imposition of grammatical features in contact situations 
are simply not relevant, as UG principles render all languages and processes of 
acquisition equal. As DeGraff (2005: 575) puts it: 


In recent work, the joint investigation of language contact, language change and lan- 
guage acquisition, suggests that there is not, and could not be, any deep theoretical 
divide between the outcome of language change and that of Creole formation. 


There is no longer any reason to suggest that the learning strategies used by such 
speakers are atypical, nor to argue, as Bickerton (1984) has done, that creoles, for 
example, reflect UG in a rarefied fashion. Moreover, while we clearly know less 
about the “Primary Linguistic Data” (PLD) which were visible to speakers in 
historical contact settings, there seems to be little evidence for the proposal that 
there is a qualitative difference between the acquisition of the first stages of either 
diachronic or synchronic contact varieties and the normal process (see Labov 1975). 
The linguistic competence of contact vernacular speakers and that of L1 acquirers 
is comparable, so it is difficult to argue that the poverty of the stimulus (in terms 
of underdetermination, degeneracy, and the lack of negative evidence, for example)
is more acute in one case than it is in the other. Furthermore, since contact
varieties often evolve in naturalistic settings after a period of relatively stable 
bilingualism, providing heterogeneous PLD, it would not be surprising to find 
certain parameters selected from both substratal and superstratal languages, 
though one would expect this selection to be constrained in various ways, 
namely, by the principles of UG and by the contact ecology, such as the degree 
of bilingualism. This is determined by E-language factors like the intensity of the 
contact between the different language groups and the extent to which one of these 
may be socially dominant (see Thomason & Kaufman 1988; Bao 2001; Mufwene
2001; Thomason 2001; Sankoff 2002; and Winford 2003b). Similarly, as argued for 
in research by Mufwene (2001), Déprez (1992), DeGraff (1999), and Lefebvre (1998: 
349), “the creators of the creole use the parametric values of their own grammar 
in assigning a value to the parameters of the language that they are creating.” 

To this end, the chapter will present two case studies representing formal 
generative analyses of both morphosyntactic imposition (where “the source lan- 
guage speaker is the agent,” in the terms of van Coetsem 1988: 3) and borrow- 
ing (where “the recipient language speaker is the agent,” 1988: 3) from different 
diachronic and synchronic contact situations (see also van Coetsem 2000). 

The first investigation focuses on a well-known macroparameter, known vari- 
ously as “pro-drop” or “null-subject” (see Chao 1981; Atkinson 1994; Fukui 1995; 
Haegeman 1997; Ouhalla 1994; 1999; Radford 2004; Newmeyer 2004; 2005; and 
Roberts & Holmberg 2005). It is thought to be responsible for the following 
cluster of properties (ignoring the difficulty that these characteristics may or 
may not correlate with a single parameter cross-linguistically as discussed in 
Newmeyer 2004 and Roberts & Holmberg 2005): (a) the availability of null subjects;
(b) postverbal subjects; (c) apparent violations of that-trace effects; (d) a
rich verbal morphology. Languages that have a positive value for this parameter 
would include Greek, Irish, and Italian. A number of pidgins and creoles also seem 
not to require subjects to be realized lexically (Romaine 1988; Déprez 1992; and 
DeGraff 1996). Languages like Standard English and French, on the other hand,
have a negative value for this parameter and do not have any of these properties. 

In section 3.1, I will use my own research on a historical contact vernacular to 
illustrate how speakers of a first language (L1) in which the deletion of subjects 
is permitted react to the acquisition of a second language (L2) that precludes 
pro-drop. 

The second case study draws on the work of King (2005) in synchronic contact 
settings. Thus, I review her analyses of a micro-parameter, namely, “Preposition 
Stranding” arising from “wh-movement” amongst speakers from two Canadian 
communities in which Acadian French is used either productively or vestigially. 
Chomsky and Lasnik (1993), Fukui (1995), Ouhalla (1994; 1999), and Radford (2004) 
all provide evidence which supports the proposal that just as languages differ 
with respect to their setting of the pro-drop parameter, they also vary with respect 
to whether or not wh-phrases may be fronted or must stay in situ. Examples (4) 
and (5) below from Ouhalla (1994: 272) illustrate this contrast between languages 
like Japanese in which wh-phrases do not undergo wh-movement and those like
English where fronting must occur to avoid ungrammaticality: 


(4) a. John-wa nani-o kaimasita ka?
       John-TOP what-ACC bought Q
       ‘What did John buy?’
    b. [CP [IP John-wa [VP [DP nani-o] [V kaimasita]]] ka]?
    c. for which thing x, John bought x
(5) a. What did John buy __?
    b. *John bought what?
    c. for which thing x, John bought x


King (2005) demonstrates that while Preposition Stranding caused by wh-movement 
of this kind in English is perfectly grammatical, it is not so in Standard French, 
so that its occurrence in Acadian varieties, which are in close contact with 
English in the region, requires a principled explanation for which the generative 
framework is ideally suited. 

It would not be appropriate in a review of this kind, however, to suggest that
the generative model as applied to morphosyntactic transfer in contact settings
such as these is without its drawbacks, so some of the major shortcomings of
the approach are outlined immediately below prior to discussing its successful
application to data from Early Modern Irish English (EMIE) and Prince Edward 
Island French (PEIF). 

Commenting on Chomsky’s (1989) analysis of do-support in interrogatives, 
Trudgill and Chambers (1991: 295) argue that: “More grammatically sophisticated 
treatments of non-standard dialects are needed, and so is a more empirically based 
approach to grammatical theory.” Attempts to resolve the key questions of the 
generative framework, as articulated in Haegeman (1997) and Chomsky (2000)
(see above), have led to an account of language, including language acquisition,
which clearly meets Trudgill and Chambers’ requirements vis-à-vis sophistication.
However, the level of abstraction required to determine universal principles
and delimit the range of variation can appear to over-reduce the complexities of 
the human communication system. Given the goals of the theory it has been expe- 
dient, for example, to assert that a suitable database is the native intuition of an 
ideal speaker-hearer in an ideal environment. In practice, this often means that 
the analyses are not informed by genuine empirical data, which, as Milroy and 
Milroy (1997), Milroy (2001) and Henry (2002; 2005) argue, can have important 
theoretical consequences. Thus, while the paradigm has occasionally acknowledged 
the existence of variation in the form of syntactic arrangements in nonstandard/contact
vernaculars which run counter to the framework (Carroll 1983; Chomsky
& Lasnik 1977; Chomsky 1981a; Koster & May 1982), the practice in such early 
generative accounts, in particular, was to ignore these and formulate a syntactic 
theory based on idealizations or standard lects and to invoke ad hoc filters and 
constraints to accommodate any variation.⁵ Henry (1995), Cornips (1998), Poletto
(2000), Adger and Smith (2005), and Barbiers (2005), which focus on nonstandard 
varieties, mark a significant move away from this position and there have been 
great advances too in: (a) diachronic accounts such as those of Kroch and Taylor
(1997), Lightfoot (1999), Roberts (2007), Roberts and Roussou (2003), and
Biberauer and Roberts (2005; 2006); and (b) analyses of creole and other language
contact data, like the research by Déprez (1992), Singh (1995), DeGraff (1996),
Lefebvre (1998), King (2000; 2005; 2008), Bao (2001; 2005), Bao and Lye (2005), 
Aboh (2006), and Lightfoot (2006). 

Another criticism with respect to the kind of data favored in the paradigm is 
the limited range of languages that have been the focus of comparative analyses 
seeking evidence for parametric variation (see Newmeyer 2004; and Baker & 
McCloskey 2005; 2007). Thus, Kayne (2000) concentrates primarily on the 
Romance language family, while Roberts and Holmberg (2005) explore the fea- 
ture value of “Agreement” (set parametrically as either nominal or non-nominal) 
across Insular and Mainland Scandinavian varieties, which are clearly typologically
congruent. There have, however, been attempts to take a more robust cross-linguistic
approach, such as Fukui’s (1995) account of parametric differences
between English and Japanese. More recently, many papers in Cinque and Kayne 
(2005), as well as Cinque’s (2005) analysis of varying [demonstrative + numeral 
+ adjective] orders in pre- and post-nominal position, also take this much wider 
view. 

Finally, there is the additional problem that although the concept of para- 
meter is a useful device for describing the language-particular properties and cross- 
linguistic differences that are key to generative analyses of morphosyntactic 
transfer under contact, I would agree with Platzack (1996: 375) that the notion 
itself is “rather fuzzy, and most attempts to find constructions in different lan- 
guages correlated by a particular value of a single parameter can be severely 
doubted.” These difficulties were recognized early in the development of the model 
by Chao (1981) in his discussion of null subjects in Brazilian Portuguese, and 
Lightfoot (1989) makes a similar case for German, which does have a rich verbal 
morphology but remains non pro-drop. Moreover, while I will argue later that 
Irish is a null-subject language in the strong sense of requiring that the subject 
position be empty in tensed clauses in which the verb is marked inflectionally, it 
lacks some of the well-known properties of such languages (McCloskey & Hale 
1984, 487-8). Indeed, McCloskey (p.c.) has suggested that where Irish is concerned, 
parametric differences appear to be “a matter of degree rather than reflections 
of binary on/off settings.”⁶ Thus, Munster and Connacht Irish dialects appear to
share more of the expected pro-drop characteristics than Ulster dialects do. These 
and related issues have received considerable attention recently in the work of 
Newmeyer (2004; 2005) and Roberts and Holmberg (2005). In brief, Newmeyer’s 
position is that parameter approaches “have failed to live up to their promise” 
(2004: 181), while Roberts and Holmberg argue that his views arise from “mis- 
understandings either of theory or of data, are conceptually misconceived, illogi- 
cal or simply false” (2005: 538). The extent to which these diametrically opposed 
views can be reconciled remains to be seen, though it is hoped that the discussion
in sections 3.1–2 below will cast further light on at least some of the key issues
that underpin the debate. 


3 Case Studies 
3.1 Language contact and pro-drop 


The properties assumed under this parameter are listed in section 2 above.
Although it was also observed there that dialects of Irish differ somewhat
in the extent to which they manifest these properties, there is agreement that,
broadly speaking, Irish and Standard English differ with respect to the setting of
this parameter: the latter (6a) does not permit null subjects, for example,
while the former (6b) does so quite readily:


(6) a. *take milk now
    b. pro Tógaim bainne anois
       ∅ take-1.SG milk now
       ‘I take milk now’


Examples (7a) and (7b) below, from Lasnik and Saito (1984: 255), illustrate a so-called
“Empty Category Principle” (ECP) violation: in (7b) the co-indexed trace t_i
in the [DP, IP] position is not properly governed because that prevents the trace
in [SPEC, CP] from antecedent-governing it.⁷ Thus, in Standard English, extraction
like this out of an embedded clause introduced by an overt that COMP is
ungrammatical, but, as (8) shows, this is not the case in Irish:


(7) a. Who_i⁸ do you think [CP t_i [IP t_i came]]?
    b. *Who_i do you think [CP t_i that [IP t_i came]]?


(8) Cé_i dúirt tú [CP t_i a [IP t_i bpósadh í]]?
    who said you that married her
    ‘Who did you say that married her?’


Similarly, languages with positive values for the pro-drop parameter like Irish 
permit postverbal subjects (6b) whereas the English equivalent *take I is unacceptable. 
Null-subject languages also tend to have a full inflectional paradigm for the marking
of person–number contrasts which, as the inflectional markings on Tógaim above
indicate, is true of Irish, but not of Modern English.

The Early Modern English (EME)/Older Scots (OS) varieties, which acted as 
lexifiers in the language contact setting arising from Ireland’s colonization, both 
had negative settings for the pro-drop parameter (Roberts 1993) whereas Irish 
in the same period would have had a positive value for it (McCloskey & Hale 
1984; Miller 1999; Carnie, Harley, & Pyatt 2000). Hence, within the generative
framework, children in this type of language contact situation are placed in a
linguistic environment with apparently mixed values for the same parameter. 
Interestingly, as Bliss (1979, §195) points out, a number of historical texts written 
in EMIE show instances of sentences such as those in (9-12) with null subjects. 
He states that the phenomenon is not easy to explain but that it might be 
attributed to properties of Irish (1979: 299). In our terms, what appears to have 
occurred is that speakers have set the parameter to the value of their L1 so that 
it will require re-setting in order to become aligned with their L2 English target. 


(9) pro is a pore Irisman pro is a leufter (Bliss 1979: 79), year: 1599 
‘[I] am a poor Irishman, [I] am a leufter’ 


(10) By got, pro ish true now (Bliss 1979: 93), year: 1613 
‘By God [it] is true now’ 


(11) pro Is tink it varm enough (Bliss 1979: 119), year: 1670 
‘[I/you] think [it] warm enough’ 


(12) pro Shal Drink the good ale Whil feather-Cock crow! 
(Bliss 1979: 163), year: 1730 
‘[I] shall drink the good ale until the weather-cock crows!’


That such resetting has indeed taken place is evident in a twentieth-century 
corpus of over 50 thousand words from speakers in an isolated rural community 
in Northern Ireland that I have digitized in which pro-drop is never attested (see 
Corrigan to appear).⁹ Moreover, neither the more recent data nor Bliss’s earlier
IE materials evidence that-trace effects, nor a rich verbal morphology which the 
grammar could have inherited (along with pro-drop) from Irish. We now turn to 
the question: What are the possible causes of the occurrence of this restricted type 
of pro-drop in EMIE and why is it no longer extant? 

Rizzi (1986) has proposed that in pro-drop languages pro is subject to two requirements
that are relevant to a generative account of the EMIE data, namely: (a)
pro is licensed under head-government, and (b) the content of pro is recoverable.

In the Irish pro-drop examples above, the null pro element in the subject posi- 
tion is viewed as being properly governed by the head of the INFL category. In 
languages like English, however, the head of INFL is believed not to be a licens- 
ing head which can govern the subject position (Roberts 1997: 273) and this is 
why constructions like (7b) above are ungrammatical. 

Moreover, Rizzi’s second principle can only be met in languages which are 
morphologically rich, because it is only in these that the governing verb has 
the relevant phi-features attached to it that permit it to be co-indexed with the 
phonologically null pro lexical subject. 
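
The way Rizzi's two requirements combine can be pictured with a minimal Python sketch. It is offered only as an illustration: the names Grammar, allows_pro, infl_licenses, and rich_agreement are assumptions introduced here, and reducing each requirement to a single Boolean feature is a deliberate simplification rather than part of Rizzi's proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """Toy grammar profile; both features are illustrative simplifications."""
    name: str
    infl_licenses: bool    # requirement (a): INFL counts as a licensing head
    rich_agreement: bool   # requirement (b): inflection identifies person/number


def allows_pro(grammar: Grammar) -> bool:
    """A null subject is possible only if it is both licensed and recoverable."""
    return grammar.infl_licenses and grammar.rich_agreement


# Feature values follow the characterization of Irish and English in the text.
IRISH = Grammar("Irish", infl_licenses=True, rich_agreement=True)
MODERN_ENGLISH = Grammar("Modern English", infl_licenses=False, rich_agreement=False)

for grammar in (IRISH, MODERN_ENGLISH):
    print(grammar.name, allows_pro(grammar))
```

On this simplified view, a grammar that met requirement (a) without meeting requirement (b) would still fail to sanction pro, since both conditions must hold together.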

Since Irish has an obligatory V-to-INFL raising rule which creates finite VSO 
word orders (see Carnie & Guilfoyle 2000), it is possible for such a language to 
have empty subjects because the subject position is properly governed by the 
verb in INFL (McCloskey 1991, §5.2). Moreover, pro subjects in Irish are easily
identified with their governors because the language displays the agreement affixes 
in (6b) above which are missing in English. If the EMIE speakers were producing 
phonetically null subjects, then, at some level, their grammars were satisfying Rizzi’s 
first requirement. In other words, they may have assumed that their contact ver- 
nacular could be pro-drop as a result of transferring the Irish value in which INFL 
is indeed a licensing head for government. 

Some corroboration for this assumption comes from: (a) the presence of null 
subjects in EME/OS in contexts where they are no longer permitted; (b) the greater 
freedom of syntactic word order in EMIE and the productivity of verb-raising 
processes, and (c) a change in the categorial status of AGR in English between 
the Late Middle and Early Modern periods. With regard to (a), both Mustanoja 
(1960: 138) and Visser (1963-73, §§3ff.) record a number of sentence types in the 
history of English that also permit null subjects. The majority of these involve 
contexts where the sentence is “sufficiently clear without the pronoun” (Visser 
1963-73: 4). However, both Visser (1963-73) and Johannesson (1985) have com- 
mented on the existence in the fifteenth and sixteenth centuries of a process of 
variable subject deletion in topic-comment structures like the following from 
Hanham’s (1975: 226) edition of the Cely letters: As for Wylliam Dalton, ∅ ys yett
yn Hollond, which may have also served to reinforce the Irish pro-drop parameter. 

Furthermore, the existence of (b) is corroborated by the findings of EMIE texts 
which evidence various types of verb-raising rule, producing VSO word orders 
as in (13-16) below: 


(13) is none of you strong enough... to owercome him 
(Bliss 1979: 65), year: 1698 


(14) Tis come bouryin you are de corp... of a verie good woman 
(Bliss 1979: 6-7), year: 1698 


(15) Is de old hawke have de old eye (Bliss 1979: 86), year: 1670-5 
(16) here is Nees beg dy Par-doon (Bliss 1979: 91), year: 1689 


Decontextualized, structures like (13) and (15) appear to be straightforward inter- 
rogatives, but these utterances (like (12) above) conclude (in the relevant texts) 
with exclamation marks indicating that they have declarative rather than interrogative
status. Focus phrases such as (14) and (15) are also relevant here in that
Filppula (1986; 1990; and 1999) finds that these are common in contemporary IE 
as a result of the topic-prominent nature of the substrate and the fact that the 
word order strategies of early English were more variable than they are now. The 
verbs that occupy the topic slot in each case are finite and the lexical (as opposed 
to expletive) subject follows. Moreover, it can be assumed that the structures 
themselves are derived by some raising process since the criteria for syntactic 
movement in these contexts, namely, the presence of focal stress and absence of 
resumptive pronouns (Tsimpli 1990: 239) are met. 
Similarly, constructions such as those in example (14) are clearly derived by
the verb-raising process associated with another well-known parameter, namely, 
“verb-second” (V2). In languages with a positive value for this parameter, there 
is a requirement that the finite verb should immediately follow the first constituent 
(Travis 1991). Certain dialects of EME/OS had this setting as can be seen from 
sentences like Heere lith the millere (Chaucer: The Miller’s Tale, ed. Robinson 1978, 
L.4212). These were productive prior to the re-setting of the V2 parameter as a 
result of syntactic change (described more fully below), which eventually precluded 
raising of this sort in Modern Standard English (see Roberts 1985; 1993; 1995; 2007; 
Kroch 1989; 2003; Kroch & Taylor 1997; and Lightfoot 1999). Thus, as Bliss (1979, 
§194) notes regarding EMIE: “The initial position of the verb in Irish is rarely 
imitated in Hiberno-English, though it does occur” and as we have seen, this 
phenomenon may have been encouraged as much by the syntactic movement 
processes of EME/OS as by the obligatory V-to-INFL raising of Irish. 
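
The V2 requirement described above (the finite verb immediately follows the first constituent) and the verb-initial pattern of Irish can be contrasted in a short Python sketch. This is an illustration only: the function name verb_position and the way each clause is divided into constituents are assumptions made for the example, not an analysis taken from the sources cited.

```python
from typing import List


def verb_position(constituents: List[str], finite_verb: str) -> str:
    """Classify a clause by where its finite verb sits among its constituents."""
    index = constituents.index(finite_verb)  # raises ValueError if the verb is absent
    if index == 0:
        return "V1"   # verb-initial, as in Irish VSO clauses
    if index == 1:
        return "V2"   # finite verb immediately follows the first constituent
    return "other"


# Constituent boundaries below are assumed for the purposes of the sketch.
chaucer = ["Heere", "lith", "the millere"]
emie_13 = ["is", "none of you", "strong enough"]

print(verb_position(chaucer, "lith"))
print(verb_position(emie_13, "is"))
```

The classification is purely linear: it says nothing about the raising operation that derives each order, which is the point at issue in the surrounding discussion.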

Fact (c) mentioned above could also be construed as a further reason why EMIE 
speakers may have assumed that the subject position was properly governed and 
hence could license pro. This concerns the feature specification of INFL in early 
forms of the superstrate, which, as noted above, scholars have argued changed 
between the late Middle and Early Modern periods. Prior to the loss of INFL as 
a proper governor with the ability to assign thematic roles, all verbs could appear 
in V2 and thus interact with negation, and other properties of INFL, notably, 
T and AGR. In late ME a series of changes occurred constituting a reanalysis 
whereby nonmodal verbs lost the ability to move up to the V position in INFL, 
thus implying a parametric resetting in Early Modern English from a “morpho- 
logical” to a “syntactic system of agreement” (Roberts 1985: 32; and Warner 1993, 
for example). The net result of the new value was that the INFL position was 
appropriated by the modals and auxiliaries which were base generated in INFL 
(now a nonthematic position) and no other verb was permitted to occur there, the
reason being that these other verbs needed to be associated with thematic positions
in order to assign the theta-roles of [+NOM] and [+ACC] to their subjects in [SPEC,
AGRs] and objects in [SPEC, AGRo], respectively. Hence the V-to-INFL rule was
eventually made redundant along with V2-type clauses, which are reminiscent 
of the EMIE structures in (13-16). 

This change was not complete until the middle of the seventeenth century, 
so we know that the early IE speakers will have been exposed to an amount of 
variability in the superstrates which acted as their trigger. Not all lexical verbs 
will have been reanalyzed, since such processes, no matter how “catastrophic,” 
will not happen overnight.¹⁰ Moreover, verb-like elements continued to occur in
INFL and it may not have been obvious that these auxiliaries had no theta-roles 
to assign and hence could not properly govern the subject. This seems to be a 
reasonable assumption on two grounds: (a) counter-evidence was provided in 
the Irish PLD which maintained a V-to-INFL operation where the raised V-INFL 
complex could assign theta-roles to the relevant subject and object DPs, and 
(b) EME/OS word order was characterized by persistent syntactic freedom and 
the V2 constraint. Notice, too, that in the examples which Bliss cites with pro-drop 
subjects, it may not be accidental that these contain forms associated with what
Roberts (1985) terms the “morphological agreement system,” namely, the modals 
Shal and Be which have dual membership of the auxiliary and lexical verb classes 
and can function as both dependent and head. 

Having provided a tentative explanation for the occurrence of nonlexical sub- 
jects in early Irish English, how might we account for the absence of this phe- 
nomenon in the contemporary Irish English corpus? Assuming Rizzi’s (1982; 1986) 
“minimal hypothesis,”¹¹ that when constructing a grammar, the child starts out
with the parameter which offers the most restrictive outcome, then the positive 
setting for the pro-drop parameter, namely “add INFL to the set of proper gov- 
ernors,” will not come about unless the child’s triggering experience includes 
exposure to sentences with null subjects as an Irish child’s does. If, in Rizzi’s terms, 
“the decision to add a given head to the class of licensers” (Rizzi 1986: 55) is no 
longer justified by the PLD, then it is to be expected that the child creating 
the contact vernacular will eventually adopt the negative setting for the pro-drop 
parameter as exposure to Irish is withdrawn by the early twentieth century due 
to community language shift (see Corrigan 1999; 2003; Kallen 1997; and Hickey 
2004). Another possible reason for its demise is the fact that the syntactic freedom 
in the word orders that are permitted in the target language has continued to be 
curtailed since the Early Modern period with the result that the V2 constraint, for 
example, becomes increasingly residual in root clauses. Thus, if we accept that 
this position is plausible, then the loss of pro-drop in Irish English is also a 
consequence of the parametric setting for the English agreement system being 
switched to one that is syntactic rather than morphological, which in turn causes 
V-to-INFL movement to be prohibited for main verbs and causes modals and 
auxiliaries to be base-generated in INFL and V2 to be lost. Moreover, the UG require- 
ment on pro, articulated as Rizzi’s second condition above, that identification 
between the INFL head and the lexical subject in [SPEC, AGRs] is mandatory,
cannot be satisfied anymore in the developing contact vernacular since the Case- 
marking properties of the lexifier were, according to Allen (1997: 77), “highly syn- 
cretic by the year 1100, and although the old category distinctions still remained, 
syncretisms had reached such a level that it was probably inevitable that these 
distinctions would soon disappear.” That Allen’s observation was correct is 
echoed by Roberts (1985: 43; 1993) who claims that by the Early Modern period 
“language learners at this time were faced with a highly impoverished morpho- 
logical agreement system” which has since reduced even further (Haegeman & 
Guéron 1999: 96), so that Haegeman (1991: 416) observes that the Case-marking 
system of Modern English is “too poor to enable one to identify person and 
number of the subject on the basis of verb forms only.” Crucially, our findings 
with regard to the observed differences in the settings of the pro-drop parameter 
between Irish/EMIE and Modern English/IE comply with proposals made by 
Roberts (1985: 33): 


1 If there is rich agreement, there will be Verb-movement to INFL [i.e.
Irish/ME].
2 If there is pro-drop, there will be Verb-movement to INFL [i.e. Irish/EMIE].
3 If there is little or no agreement, there will be no Verb-movement [i.e.
Modern English/IE].¹²
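

Roberts's three implications can be read as consistency checks on a variety's profile, which the following Python sketch tries to make explicit. It is a hedged illustration: Profile, violated, negate, and consistent are names invented here, and the feature values assigned to each variety simply follow the bracketed labels above, with None marking a property the list leaves unspecified.

```python
from typing import NamedTuple, Optional


class Profile(NamedTuple):
    """Three properties of a variety; None means 'not specified in the list'."""
    rich_agreement: Optional[bool]
    pro_drop: Optional[bool]
    v_to_infl: Optional[bool]


def violated(antecedent: Optional[bool], consequent: Optional[bool]) -> bool:
    """An implication fails only when its antecedent holds and its consequent does not."""
    return antecedent is True and consequent is False


def negate(value: Optional[bool]) -> Optional[bool]:
    return None if value is None else not value


def consistent(p: Profile) -> bool:
    return not (
        violated(p.rich_agreement, p.v_to_infl)                      # implication 1
        or violated(p.pro_drop, p.v_to_infl)                         # implication 2
        or violated(negate(p.rich_agreement), negate(p.v_to_infl))   # implication 3
    )


IRISH = Profile(rich_agreement=True, pro_drop=True, v_to_infl=True)
EMIE = Profile(rich_agreement=None, pro_drop=True, v_to_infl=True)
MODERN_ENGLISH = Profile(rich_agreement=False, pro_drop=False, v_to_infl=False)

for name, profile in [("Irish", IRISH), ("EMIE", EMIE), ("Modern English/IE", MODERN_ENGLISH)]:
    print(name, consistent(profile))
```

A purely hypothetical profile with little agreement but with Verb-movement, such as Profile(False, True, True), is rejected by implication 3, which is the kind of combination the list is meant to rule out.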


3.2 Language contact, wh-movement, and 
preposition stranding 


King (2005) examines preposition stranding data from two regions of Prince Edward 
Island (PEI), Canada, namely, Evangéline and Tignish. Members of the former 
community are either monolingual in French or bilingual in French and English, 
their French being dominant. The population of Tignish, by comparison, are ex- 
periencing language attrition from French to English. Although this not unex- 
pectedly leads to higher numbers of English-origin prepositions in the Tignish 
dataset, King reports that “the actual patterning of preposition usage was found 
to be the same for the two communities,” nor were these used to mark social dif- 
ferences (King 2005: 240). Thus, wh-movement in both types of PEIF can lead to 
prepositions being stranded as in (17) below, which is reminiscent of the English 
example demonstrating wh-movement in (5a) in section 2 above: 


(17) Quoi ce-que tu travailles dessus ∅?
what that you work on? 
‘What are you working on?’ 


In this PEIF example, as is true also of the English gloss, the movement leaves 
the prepositions dessus/on stranded. While both standard and vernacular varieties 
of French permit structures containing so-called orphan prepositions in certain 
types of clause (when, for instance, they can be recovered via “discourse linking,” 
as King (2005: 242) notes regarding (18) below), there are thought to be restrictions
on the prepositions involved. De ‘of’ and à ‘to’ are generally ruled out in
such constructions in Standard French and indeed in Canadian varieties such as 
Montreal French. 


(18) et vous coulez avec 
and you sink with [it]
‘and you sink with it’ 


Hence, PEIF seems unusual in permitting (19) and (20) below which contain these 
prepositions and also seem to involve movement (though constrained by sub- 
jacency so that extraction is not too “long,” i.e. across at least two bounding nodes 
such as TP or DP as in the ungrammatical (26) below):¹³


(19) le gars que je te parle de ∅
     the guy that I you talk of
     ‘the guy I am talking of’


(20) le gars que j’ai donné la job à ∅
     the guy that I have given the job to
     ‘the guy I’ve given the job to’


King (2005: 246) links the acceptability of Preposition Stranding in this variety 
with the frequency of English-origin prepositions, since there appears to be a 
correlation between their occurrence to the extent that it is not just the preposi- 
tions themselves that have been borrowed but also a structural feature of English 
prepositions more generally, namely, that they can properly govern the empty 
position. This would, therefore, be similar to the transfer noted in section 3.1 from 
Irish to EMIE with respect to INFL as a proper governor of the null subject, though 
of course the direction of transfer is different in each case. In the PEIF scenario, 
the transferred feature would have been to extend the ability of all prepositions 
to govern the ∅ element included in (19) and (20) above.

King (2005) then goes on to query whether Preposition Stranding in English 
and PEIF are so alike as to be a true case of transfer in the sense used here by 
first considering the restrictions on grammaticality in each case, noting that while 
English (21) is unacceptable, similar constructions (22–24) have actually been judged
grammatical by speakers in her sample from both Evangéline and Tignish. 


(21) *Who did Pugsley give a book yesterday to? 


(22) Quoi ce-que tu as parlé hier à Jean de?
     What that you have spoken yesterday to John about
     ‘What did you speak yesterday to John about?’


(23) Quoi ce-que tu as parlé à Jean hier de?
     What that you have spoken to John yesterday about
     ‘What did you speak to John yesterday about?’


(24) Quoi ce-que tu as parlé à Jean de hier?
     What that you have spoken to John about yesterday
     ‘What did you speak to John about yesterday?’


These data appear to indicate that “the structural relationship between the verb 
and the preposition found in English is not relevant in PEI French” (King 2005: 
248), which is not unexpected given the fact that English generally has tighter 
constraints on adjacency than French does. 

One analysis of these contrasting levels of acceptability outside the generative 
enterprise might simply be that the calquing of English-origin prepositions in PEIF 
as a result of ongoing language contact has activated a new system in this ver- 
nacular which allows more types of preposition to be stranded and also permits 
such movement to take place in a wider range of construction types than is pos- 
sible even in the English model. 

However, such an explanation really does seem to be a good example of the 
kind of approach to morphosyntactic transfer that I described in section 2 as 
being underpinned by Matras’ “anything-goes hypothesis” (1998: 282). In the first
place, it does not account in a principled way for the lexical dissimilarities 
observed between Canadian varieties, which King (2005: 237) describes in gen- 
erative terms (after Chomsky 1993) as “differences in the morphosyntactic prop- 
erties of different lexical items.” Secondly, it presupposes that the transfer is not 
at all constrained by features such as the framework’s “subjacency parameter” 
(Roberts 1997: 197-8). Hence, the grammaticality contrasts between (25) and (26) 
below also indicate that this overgeneralization analysis is suspect, since so-called 
“short” wh-movement (25) is permitted in PEIF, but “long” wh-movement is imposs- 
ible as this would involve movement from a complex DP (26), contrary to the 
parametric value for subjacency set for this variety: 


(25) Qui_i ce-que tu connais ∅?
     Who that you know


(26) *Qui_i ce-que [TP tu connais [DP le projet à ∅_i]]?
     Who that you know the project of
     ‘Who do you know the project of?’
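

The contrast between (25) and (26) can be restated as a count of bounding nodes crossed by the moved wh-phrase, which the following Python sketch illustrates. It rests on stated assumptions: TP and DP are taken as the bounding nodes, as in the text; the path for (26) is read off its bracketing, while the path for (25) is assumed to contain a single TP; and the names BOUNDING_NODES, nodes_crossed, and extraction_ok are hypothetical.

```python
from typing import List

# Bounding nodes as named in the text; other categories are ignored here.
BOUNDING_NODES = {"TP", "DP"}


def nodes_crossed(path: List[str]) -> int:
    """Count the bounding nodes between the gap and the landing site."""
    return sum(1 for node in path if node in BOUNDING_NODES)


def extraction_ok(path: List[str]) -> bool:
    """Movement is blocked once it crosses at least two bounding nodes."""
    return nodes_crossed(path) < 2


# Paths run from the gap upwards to the fronted wh-phrase.
short_movement_25 = ["TP"]          # assumed structure for (25)
long_movement_26 = ["DP", "TP"]     # from the bracketing given in (26)

print(extraction_ok(short_movement_25))
print(extraction_ok(long_movement_26))
```

Because the check is stated over structure rather than over particular prepositions, it blocks (26) even though the stranded preposition there is one that PEIF otherwise allows to be stranded.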


By adopting the technical apparatus of the generative framework, King (2005) has 
been able to demonstrate that this phenomenon is not overgeneralization per se 
but that it appears to be a case of “lexical borrowing having syntactic effects 
in the recipient language” (2005a: 249). What is of interest to the goals of this 
chapter is that this research also demonstrates the importance of taking a holistic 
approach to the grammar which takes account of the formal, structural properties 
of the two languages in contact, so that analyses are suitably constrained. 


4 Conclusion 


The case studies reviewed here were offered as an attempt to show that genera- 
tive analyses of morphosyntactic change in language contact situations can offer 
a constrained and sophisticated approach to the genesis of these vernaculars which 
takes due account of the heterogeneous triggering experience and does not treat 
them as unnatural languages. Indeed, it accords rather well with the view 
expressed in Crowley (1991: 282) that “what is most likely to ‘survive’ in a radic- 
ally altered contact language is a set of features combining substratum and super- 
stratum features, as well as other features that develop independently, for a variety 
of reasons, including universal pressures.” The particular advantage that this
paradigm has over purely substratist approaches to contact-induced morphosyntactic
change is that it provides a powerful tool for the analysis of
syntax that goes beyond the mere comparison of surface strings. In the case of
alleged differences between the lexifiers, substrates, and EMIE/PEIF, for example,
it has enabled us to test whether these are deep-seated/macroparametric,
i.e. they explain clusterings of properties within the grammar as in the setting of
the pro-drop parameter, or more peripheral/microparametric like the borrowing
of English-origin prepositions in PEIF. Moreover, the discussion of the raising
processes involved in each case identified properties of the vernaculars, their lexifiers 
and substrates which at first blush appeared to be quite distinctive, but could, in 
fact, be seen to be predictable by universal and language-particular features, which 
in essence solves the acquisition problem often faced by accounts outwith this 
framework (see Chomsky 2004). 

Finally, although I would agree with Breitbarth, Lucas, and Willis (2008) that 
explaining contact-induced morphosyntactic change is still generally perceived 
to be a challenge to the generative model in certain respects (since contact is 
normally between adults rather than children as demonstrated in the two case 
studies sketched here), the construction of a grammar is a uniform process from 
a biolinguistic perspective and, as such, grammars that have undergone change 
from one generation to another as a result of contact are not different in kind to 
those that have been affected by internal change to the PLD. The grammars of 
EMIE or PEIF are, on this view, therefore neither exceptional in the sense of
DeGraff (2005) nor, crucially, beyond the proper scope of formal theories
of syntactic change.


NOTES 


This paper has benefited considerably from discussion with participants of the Language 
Contact Workshop at the 31st GLOW Colloquium, March 29, 2008, Newcastle University, 
UK. Particular thanks are due to Enoch Oladé Aboh, David Willis, Donald Winford, and 
Rita Manzini. 

1 Aitchison (1980), for instance, has proposed that historical linguistic change cannot be 
explained without a detailed examination of creologenesis and Bickerton (1984 and 
elsewhere) has made even more far-reaching claims. See Mufwene (2001), DeGraff (2005), 
and Schneider (2007) for counter-claims. 

2 See Singh (1995) for one of the earliest treatments I am aware of demonstrating the 
importance of harnessing the devices available within formal generative theories 
for explaining contact phenomena (using evidence from English and Hindustani in 
India). 

3 There is no implication intended that other frameworks are somehow less adequate 
for this purpose. Indeed, proponents of grammaticalization theory, such as Heine and 
Kuteva (2005), and treatments of contact phenomena within optimality theory, lexical 
functional grammar, and linguistic typology (as demonstrated by Thomason 1997; 
Bresnan 2000; and Sharma 2001 as well as the collections of Aikhenvald & Dixon 2006; 
2007; Haspelmath et al. 2001; Matras et al. 2006) have made vital contributions 
not only to the description of contact settings across the world’s languages but 
also to the principles of contact-induced change and the factors that facilitate the 
diffusion of morphosyntactic characteristics from one language to another. I focus
on formal generative frameworks simply because these are the ones with which
I am most familiar. See also Baker and McCloskey (2005; 2007) and Newmeyer (2005)
on the important interrelationship between typology and generative linguistics. 
4 Various terms have been used to describe and differentiate between types of
contact-induced change in the literature. The distinctions recently suggested by Winford (2003b)
between ‘interference/imposition’ and ‘borrowing’ as well as the related ‘convergence’,
‘reanalysis’, ‘relexification’, ‘substratum influence’ and ‘transfer’, have been critical to 
our understanding of the differences ‘between the processes and outcomes characteristic 
of each’ (Winford 2003b: 129). As such, I apply the notions ‘imposition’ and ‘borrowing’ 
in the sense of van Coetsem (2000) and Winford (2003b) below. The term ‘morpho- 
syntax’ is also used here to avoid unnecessary complexity as a catch-all term for any 
kind of grammatical change whether that involves the syntax proper (i.e. the order 
of meaningful elements in a clause as VSO, SVO and so on) or grammatical forms 
which can carry inflections that operate at the interface between morphology and 
syntax, such as tense, mood and aspect markers (see the papers in Mereau 1999 for 
further clarification of this distinction). 

5 It is important to bear in mind, of course, that the theory was traditionally never intended
to explain human communication more widely, which is why it excludes performance 
data and any potential interaction between the syntactic module and others (particu- 
larly pragmatics). 

6 Ackema (2001) provides a wide-ranging discussion of such complexities regarding
inflectional verb movement which are not generally assumed in the literature. 

7 This phenomenon is generally termed the “that-trace effect” for this reason.

8 The subscript “i” here is the usual convention to demonstrate that although extraction
has taken place (hence “∅”), the site of the extracted element and the moved wh-phrase
can still be identified with one another. 

9 Hickey (2007: 143-5, 162-4) shows similar findings for his own A Collection of Contact
English in which VSO word order never occurred in the English speech of Irish native 
speakers. 

10 On the continuing debate regarding gradual versus abrupt diachronic change, see Kroch
(1989) and, more recently, Lightfoot (1999; 2006). For evidence that will, can, and to a 
lesser extent may retain main verb properties later than 1750, see the discussion in Warner 
(1983; 1993). 


11 Also termed the “subset principle/condition” in other formulations, see, for example, 
Ouhalla (1994). 

12 As already noted, counter-claims regarding this parameter are well known in the 
literature but this analysis is nonetheless useful in this context as it exemplifies key 
aspects of the generative approach to language change. 

13 This is a complex area of grammar that there isn’t space to properly explore here. 
Vasishth and Lewis (2006) provide a good overview of the issues. 
6 Computational Models and
Language Contact 


APRIL McMAHON 


1 Introduction 


Early in her first chapter, Thomason (2001: 10) asks the question “Where is language 
contact?” The answer is straightforward: “Language contact is everywhere.” As 
Thomason continues (2001: 12), “language contact is the norm, not the exception. 
We would have a right to be astonished if we found any language whose speakers 
had successfully avoided contacts with all other languages for periods longer than 
one or two hundred years.” 

If language contact really is virtually universal in space and time, then it is all 
the more amazing that it has tended to be marginalized from historical linguis- 
tics; or at least from that part of the discipline which focuses on the grouping of 
languages into families, and on the reconstruction of unattested protolanguages. 
The family tree model has typically found no place for pidgins, creoles, and mixed 
languages. For other languages, which may have undergone contact but have not 
entirely been created through it, the family tree model is only workable if the signs 
of borrowing or other-language influence can be tracked down and removed so 
that the “true” history of each language system can be established, without these 
additions and excrescences. 

Yet we know contact can affect not only the lexicon, but also the phonology, 
morphology, and syntax; to quote Thomason again, “all aspects of language struc- 
ture are subject to transfer from one language to another, given the right mix of 
social and linguistic circumstances” (2001: 11). Regardless of our perspective on 
language contact, linguists therefore need to engage with the central question of 
whether linguistic features which owe their existence to descent from an ances- 
tral variety or protolanguage within a family can be distinguished from those which 
have been borrowed or remolded on the basis of another language. This is 
important for linguists who are interested in variation and change, and in how 
the former becomes the latter: contact is self-evidently one source of variation, 
and therefore contributes to language change. If we accept Thomason’s claim above, 
linguists focusing on language contact will wish to identify the particular “social 
and linguistic circumstances” (2001: 11) which facilitate or block certain kinds of 
contact-induced change. Typologists may ask what kinds of change can take place 
only in contact situations, as opposed to those which can happen “naturally,” with- 
out any external input from another system. Theoretical linguists might priori- 
tize analyses of “native” processes of phonology or syntax in the language they 
are studying, while taking a different view of the status of synchronic processes 
which are transparently derived from another language: in short, they might be 
less concerned if their theoretical model copes less well with the latter. And as 
we have seen, linguists interested in language classification commonly also wish 
to identify and remove loans — as Kessler (2001: 5) puts it: 


It is probably fair to say that for most researchers, borrowing has been considered 
noise in the system, a perturbation that makes it more difficult to discover a neat 
underlying tree. For them, loans are to be sought out and discarded before the real 
work can proceed. 


From the point of view of this chapter, all these possible motivations for iso- 
lating borrowed features from inherited ones are potentially relevant — but what 
matters most is simply the growing consensus in linguistics that identifying and 
analyzing contact-induced changes is important, and hence the increasingly rel- 
evant question of how that can be achieved. In many cases, this will require the 
detailed investigation of specific linguistic and sociolinguistic data by linguists 
who know the languages concerned intimately. However, lack of data, or the 
considerable time depth involved, will sometimes leave uncertainties; and indi- 
vidual experts may still disagree, so that there is scope for novel methods to 
be applied to validate existing findings, or to go beyond these in cases where 
traditional methods are hampered or inapplicable. Quantitative and computational 
methods are currently being developed in many areas of historical linguistics, 
from language family and subfamily grouping (Ringe, Warnow, & Taylor 2002; 
McMahon & McMahon 2003) to dating of protolanguages (Gray & Atkinson 2003). 
The purpose of such phylogenetic approaches is to apply explicit, objective, and 
replicable methods of comparison to language data, preferably from databases 
which are available for other research groups to test and use. Ideally, these com- 
parison programs or metrics go through many repetitions or iterations, or are tested 
statistically, to establish our degree of confidence in a particular outcome. Results 
may then be visualized as family trees, typically derived through software 
designed for other domains (commonly but not exclusively population genetics). 
However, rather than choosing a preferred family tree, sometimes on unclear, 
personal, or intuitive grounds, these programs often generate many thousands of 
possible trees, homing in on a subgroup of trees which are most consistent with 
the data. In more recent work, trees are increasingly supplanted by networks, which 
display many possible trees simultaneously, noting cases where relationships 
between languages are complex or unclear. These quantitative approaches to lan- 
guage also allow comparison with data from genetics, archeology, and anthro- 
pology, giving potentially powerful insights into cultural evolution and the 
relationships between human populations. For recent overviews of activity in this 
developing field, see McMahon and McMahon (2005), McMahon (2005), and 
Forster and Renfrew (2007). 

In the rest of this chapter, focusing on one area of the grammar at a time, 
I will illustrate how such quantitative and computational methods can provide 
insights on language contact and new opportunities to identify its effects, as well 
as noting areas where research is just beginning. 


2 Loanwords and the Lexicon 


Much of the phylogenetic work to date on language groupings and language change 
has focused on vocabulary, and particularly on lists of basic vocabulary items. 
Although Embleton (1986; 2000) notes that comparative work of this sort predates 
Morris Swadesh, these are almost universally known as Swadesh lists, and 
include common meanings relating to universal human features and experiences, 
including body parts, natural phenomena like the sun or water, and small 
numerals. Comparison of Swadesh lists, often called lexicostatistics, involves 
establishment of the most common or basic word in each language for each of 
the 100 or 200 meanings in the list; the items are then considered to assess which 
pairs across two languages are cognate, or appear to come from a single common 
ancestor form. The number of cognates is tallied up and used to establish 
whether the languages in question are likely to be related, and perhaps how closely. 
In classic lexicostatistics, the key to identifying cognates is the prior application 
of the comparative method (see Rankin 2003; Harrison 2003; and Campbell 2003), 
which establishes recurrent correspondences between sound patterns across 
languages: in the absence of this information, comparison would be by surface 
similarity, which could not rely on excluding loans or indeed chance resemblances. 
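
The tallying step of classic lexicostatistics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of any published procedure: the function name `cognate_percentage`, the language labels, and the cognate-class codes are all hypothetical, and the codes are assumed to have been assigned beforehand by applying the comparative method.

```python
from typing import Dict, Optional

# Meaning -> cognate-class label, assigned in advance by a linguist.
# None marks a meaning for which no form is available.
CognateCodes = Dict[str, Optional[str]]


def cognate_percentage(lang_a: CognateCodes, lang_b: CognateCodes) -> float:
    """Percentage of comparable meanings whose forms share a cognate class."""
    comparable = [
        meaning for meaning in lang_a
        if lang_a[meaning] is not None and lang_b.get(meaning) is not None
    ]
    if not comparable:
        raise ValueError("no comparable meanings")
    matches = sum(1 for m in comparable if lang_a[m] == lang_b[m])
    return 100.0 * matches / len(comparable)


# Invented toy data for two unnamed languages (not real judgments).
lang_x = {"one": "A", "two": "A", "water": "A", "dog": "A", "egg": "B"}
lang_y = {"one": "A", "two": "A", "water": "A", "dog": "B", "egg": "A"}

print(cognate_percentage(lang_x, lang_y))  # 3 of the 5 meanings match
```

Note that the sketch presupposes the cognacy judgments; it does nothing to detect loans, which is precisely the difficulty discussed in the rest of this section.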

Kessler (2001), in his own work on word lists, is somewhat unusual in expli- 
citly not seeking to distinguish borrowings from shared inherited items: “whether 
language elements share certain properties because they are inherited from a 
common ancestor language, or whether they share them through borrowing, the 
languages and the elements in question can be said to be historically connected” 
(2001: 5; emphasis original). This choice is made for two reasons. First, Kessler’s 
study is primarily concerned with distinguishing historical connection (regard- 
less of the type or source) from chance, and specifically with testing historical and 
statistical methods to assess whether they are sufficiently robust to prove histor- 
ical connection, or whether chance similarities trip them up. Undoubtedly, dis- 
tinguishing meaningful similarities from chance ones is an important goal of any 
quantitative investigation. However, McMahon and McMahon (2005: 92) argue 
that this alone does not merit ignoring the difference between the two sets of 
factors which might lead to meaningful similarities: 


Kessler is combining two different contributions to history which we might want to 
keep separate: it should be possible to agree that borrowing and common ancestry 
are both important, without having to go to the extreme lengths of collapsing the 
distinction. 


Distinguishing contact-induced change from other types might be important 
typologically and theoretically. At the very least, since it has been suggested that 
external and internal factors might lead to different types of change, or different 
mechanisms for change (see Labov 2007 on diffusion versus transmission, for 
example), it would seem sensible to try to approach them differently to start with, 
until we can assess whether this is actually the case. 

The second reason for Kessler’s combination of borrowing and inheritance into 
historical connectedness, however, is his contention (2001: 109) that “most of the 
time one really is in the dark about whether words are loans or not. At some point 
in prehistory, those questions become unanswerable.” It is trite but true to say 
that much depends on the historical circumstances: if we have lots of data about 
the donor and recipient languages, ideally over a long historical span, and if our 
external historical and social knowledge confirms that there was interaction of 
speakers and perhaps the domains in which it is likely to have occurred, then 
establishing borrowings either by their linguistic shape, or their appearance in 
the written records at a particular time and context, is relatively straightforward. 
Thus, we can establish that there are many loans from French into English: so, 
when Chaucer in the General Prologue to the Canterbury Tales describes “a verray, 
parfit gentil knyght,” all of verray, parfit, and gentil are French - though 
occasionally, as with canif and knife, 
we find a borrowing in the other direction, from English into French. Sometimes 
we even find contemporary grammarians identifying and discussing loans, or 
contemporary dramatists and essayists ridiculing and parodying them (though 
admittedly that tends to be the icing on the historical linguistic cake). 

However, when we are using basic vocabulary lists, not all historical linguists 
would even agree that we need to seek out loans with our customary care. As 
Kessler (2001: 109) notes, “there is an idea in the air that it has been proved that 
words on the Swadesh 100 list are so rarely borrowed that it is safe to ignore the 
problems of loans when using that list - maybe there will be one or two borrowings, 
but surely not enough to skew the overall results.” To some extent, Kessler sup- 
ports this view: 


It seems intuitively likely that the words in the Swadesh lists are, on the whole, less 
likely to be borrowed than other words one might pick at random from the dic- 
tionary. There does not seem to be any reason why a language would borrow words 
like egg and dog from some other language, but one can think of compelling reasons 
why they might borrow words along with novel cultural concepts such as telephone 
and miniskirt. (Kessler 2001: 103) 


Exceptions, however, are not hard to come by — English notoriously has borrowed 
the third person plural pronouns they, their, and them from Norse, along with get, 
skin, and root, and a range of other basic items — including egg. Indeed, Kessler 
(2001: 109) wonders “how the idea could have got started, when even English 
has a large number of loans in these lists.” Lest English (with a total of 31 loans 
identified in Kessler’s work across the 100- and 200-item Swadesh lists) be 
argued away as particularly odd or special, Kessler identifies even more items 
borrowed into the Albanian lists (a total of 41, with 35 of these from Latin), 27 
into French, and 22 into Turkish. Embleton (1986), for the 200-item Swadesh list 
alone, identified 12 loans from North Germanic into English, 12 from French into 
English, and 15 from Dutch into Frisian. The conviction that borrowing is a neg- 
ligible factor in Swadesh list comparisons dies hard, however: in connection with 
an interesting recent claim that languages may change in “punctuational bursts,” 
where lexical replacement is particularly frequent around the point where two 
languages split in a tree, Atkinson et al. (2008: supplementary material 3) note 
that “our inference model assumes vertical transmission of cognates (down a 
lineage) and does not explicitly account for the borrowing of cognates between 
languages. We expect the rate of borrowing in the Swadesh vocabulary to be low.” 
Atkinson et al. nonetheless attempt to test whether borrowing might be respon- 
sible for the appearance of punctuations in their data by carrying out simulations, 
though where the actual borrowing rates in the language families they examined 
(Indo-European, Bantu, and Afro-Asiatic) are not known, this can be only 
suggestive. 

These findings raise two main questions if we wish to make progress on iden- 
tifying loans in basic meaning lists. For the most part, loans have been discov- 
ered by individual linguists scanning through lists on the basis of considerable 
knowledge of the histories of the languages concerned. Is there any way of 
automating this process, or of developing algorithms for seeking out loans? And 
what should we do if we do not have sufficiently secure knowledge of the lan- 
guage histories to be clear about the regular correspondences from which loans 
might depart — in the extreme case, what could we do if we wanted to compare 
meaning lists for languages which may or may not be related? 

One of the first attempts to identify a formula for the identification and exclu- 
sion of loans is in Embleton (1986). Embleton proposes a borrowing parameter 
which should be incorporated into the calculations for Swadesh list comparisons, 
of b_kx, where b is borrowing into x from each of its neighbors, k. Unfortunately, 
as McMahon and McMahon (2005: 91) observe, b can only be calculated by 
adding up all the actual borrowings already identified in the list, and the value 
does not generalize to other neighboring languages, which will have their own, 
independent borrowing rates for each pairwise comparison, presumably accord- 
ing to Thomason’s (2001: 11) “mix of social and linguistic circumstances.” 

Although it has not been possible to develop an algorithm for diagnosing loans, 
other computational approaches to the problem have been proposed. For ex- 
ample, Ringe et al. (2002), in their computational cladistics project, take a “perfect 
phylogeny” approach to the subgrouping of 24 Indo-European languages. They 
use a fixed list of 333 lexical characters, chosen specifically to provide as much 
information as possible about Indo-European language history, rather than for 
their basic or unborrowable qualities, though there is nonetheless considerable 
overlap with the Swadesh lists (Ringe et al. also use 22 phonological and 15 mor- 
phological characters; see sections below for further discussion). Languages are 
then grouped on the basis of the number of character states they share. Ringe 
et al. seek the “perfect phylogeny,” or the single, true tree which shows no 
discrepancies at all with the data. The ideal tree turns out to be a fictional beast 
(though some come much closer than others); but from our point of view, the 
methodology is interesting partly because of what it tells us about the characters 
which are persistently discordant with the better trees. Wang and Minett (2005: 
123) describe Ringe et al.’s approach as follows: 


determination of the optimal position of each sub-group is undertaken by seeking 
the topologies — there might be more than one — that are compatible with the great- 
est number of characters. The remaining, incompatible characters are viewed as hav- 
ing been subject to nongenetic processes, such as borrowing, and are not used to 
determine the optimal topologies. 


Germanic (which we have already seen has been found to have a considerable 
volume of borrowing by Kessler (2001) and Embleton (1986)) is particularly 
plagued with incompatible characters in Ringe et al.’s (2002) analysis, though 
direct comparison between these results and those from Kessler and Embleton is 
difficult because the word lists used are not identical. Furthermore, although the 
fact that Germanic shares lexical character states with a wide range of other 
subgroups is suggestive of borrowing, this is not the only possible explanation 
for the Germanic character incompatibilities, which could also reflect the existence 
of a dialect continuum for part of the history of this subgroup: as Ringe et al. 
(2002: 111) conclude, “it is clear that the development of Germanic exhibits 
some characteristics which cannot realistically be modelled with a ‘clean’ evolu- 
tionary tree, but it is not clear what historical developments have given rise to 
those anomalies.” 
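
The idea of sorting characters into those compatible and those incompatible with a candidate tree can be illustrated with a small Python sketch. It is a simplification under stated assumptions and not Ringe et al.'s implementation: characters are taken to be binary, the tree is assumed to be fully resolved and is given simply as a list of clades, and the taxa, clades, character names, and states are all invented. The test used is the standard one for bipartitions: two splits of the same set of languages fit on one tree if at least one of their four pairwise intersections is empty.

```python
from typing import Dict, FrozenSet, List, Tuple


def splits_compatible(a: FrozenSet[str], b: FrozenSet[str],
                      taxa: FrozenSet[str]) -> bool:
    """Two bipartitions fit on one tree if one of four intersections is empty."""
    a_rest, b_rest = taxa - a, taxa - b
    return any(len(part) == 0
               for part in (a & b, a & b_rest, a_rest & b, a_rest & b_rest))


def partition_characters(
    clades: List[FrozenSet[str]],
    characters: Dict[str, Dict[str, int]],
    taxa: FrozenSet[str],
) -> Tuple[List[str], List[str]]:
    """Separate binary characters into compatible and incompatible with the tree."""
    compatible, incompatible = [], []
    for name, states in characters.items():
        ones = frozenset(lang for lang in taxa if states[lang] == 1)
        if all(splits_compatible(ones, clade, taxa) for clade in clades):
            compatible.append(name)
        else:
            incompatible.append(name)
    return compatible, incompatible


# Hypothetical tree ((A,B),(C,(D,E))) over five invented languages.
taxa = frozenset("ABCDE")
clades = [frozenset("AB"), frozenset("DE"), frozenset("CDE")]
characters = {
    "c1": {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0},
    "c2": {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1},
    "c3": {"A": 0, "B": 1, "C": 1, "D": 0, "E": 0},  # cuts across the tree
}

print(partition_characters(clades, characters, taxa))
```

In this toy case c3 groups B with C against the tree. As the discussion of Germanic shows, such a character is a candidate for borrowing, but incompatibility alone cannot distinguish borrowing from other causes such as a former dialect continuum.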

Nakhleh, Ringe, and Warnow extend this perfect phylogeny work, using the 
same languages as Ringe et al. (2002), and a subset of the same characters, 
excluding those that are polymorphic or clearly show parallel development. 
Their aim is in part “to address the problem of how characters evolve when diverg- 
ing language communities remain in significant contact” (2005: 384); one corol- 
lary of such situations is that linguistic behavior should not be entirely treelike, 
so that networks rather than trees are preferable in visualizing the results. The 
Nakhleh, Ringe, and Warnow networks are essentially trees, but with added 
“edges,” which are potentially bidirectional, linking languages or varieties which 
are in a borrowing relationship. This way of handling contact reflects Nakhleh 
et al.’s conviction that vertical, ancestor to daughter inheritance of features is the 
normal case, while contact is exceptional: “Our analysis shows dramatic support 
for the claim that the diversification of IE was largely treelike: almost all (95%) 
of the characters evolve down our proposed genetic tree, and we need only 
three additional contact edges to explain all the data” (2005: 391). Two of these 
contact edges involve Germanic; the third involves Proto-Italic and Proto-Greco- 
Armenian, but this is much more weakly supported. Again, however, Nakhleh 
et al. select the network showing these likely contact episodes over alternative 
networks suggesting contact between, e.g., Slavic and Tocharian on the grounds of 
plausibility given our knowledge of historical population contacts and migration 
patterns, so that while this work includes a component of automatic generation 
of network options, the prioritization of networks remains a matter of linguistic 
decision-making. 

Wang and Minett (2005) instead use the concept of skewing of lexical similar- 
ities, which they define using the example of a study by Hinnebusch (1999). 
Hinnebusch compared three Nilotic languages, noting that Samburu and Nandi 
shared 9.9% of the 200-meaning list, while Masai and Nandi shared 15.7%, and 
had moreover been in close contact for a long period. Wang and Minett (2005: 
124) observe that this amounts to attributing the 5.8% difference, or skewing in 
lexical similarities across these two pairs, to borrowing, and concede that this “is 
certainly intuitively appealing.” Nonetheless, alternative explanations are again 
possible: Masai may have retained more common items which Samburu lost, or 
Samburu might have been in contact with a different language, and borrowed 
items from there to replace its previous cognates with Nandi. Wang and Minett 
therefore adopt Hinnebusch’s general concept of skewing, but test it statistically. 

Wang and Minett operate with a straightforward definition of skewing for a 
group of related languages. The skewing between languages A and B is the 
similarity of languages A and B minus the similarity of related languages B and 
C; while the aggregate skewing of A relative to C is the average skewing of A 
and each of the other related languages in the group, compared with C. Wang 
and Minett generated random data for ten related “languages” which have split 
into two subgroups of five, with a homogeneous retention rate set at 90%. The 
distribution of aggregate skewing for all the pairs was calculated over 1,000 runs; 
thereafter, 10% borrowing between one language in one subgroup and one in 
the other was introduced, and the distributions were recalculated. The mean 
aggregate skewing in the condition with no borrowing, and in the borrowing 
condition but for those languages not in contact, was approximately zero; but 
in the borrowing condition, for the languages in contact, the mean was 3.3%, 
significantly above zero. Over a range of further tests, Wang and Minett conclude 
that when we find at least 3.7% aggregate skewing, we can infer contact correctly 
in 42% of cases, though in 6% we will find a skewing over 3.7% without contact. 
This performance may become worse if the retention rate is very heterogeneous, 
or if one donor language is in contact with two recipient languages in the other 
subgroup; conversely, when one recipient language is in contact with two donor 
languages in the other subgroup, detection of contact markedly improves. 
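
The arithmetic behind skewing can be sketched as follows. The sketch assumes the reading suggested by the Hinnebusch figures, in which two languages are each compared with a third, reference language and the two similarity scores are subtracted; Wang and Minett's own formulation may differ in detail. The function names and the similarity-table layout are hypothetical, and only the two Nilotic percentages come from the text above.

```python
from statistics import mean
from typing import Dict, FrozenSet, Iterable

# Pairwise lexical similarity (in percent), keyed by an unordered language pair.
Similarity = Dict[FrozenSet[str], float]


def sim(table: Similarity, x: str, y: str) -> float:
    """Look up the similarity of an unordered pair of languages."""
    return table[frozenset((x, y))]


def skewing(table: Similarity, a: str, b: str, ref: str) -> float:
    """Similarity of a to ref minus similarity of b to ref (assumed reading)."""
    return sim(table, a, ref) - sim(table, b, ref)


def aggregate_skewing(table: Similarity, a: str,
                      others: Iterable[str], ref: str) -> float:
    """Average skewing of a against each other language, all relative to ref."""
    return mean(skewing(table, a, b, ref) for b in others)


# The two percentages reported for the 200-meaning list in Hinnebusch's study.
nilotic = {
    frozenset(("Masai", "Nandi")): 15.7,
    frozenset(("Samburu", "Nandi")): 9.9,
}

print(round(skewing(nilotic, "Masai", "Samburu", "Nandi"), 1))  # 5.8
```

A positive value alone does not establish contact. The simulations described above are what supply a threshold (3.7% aggregate skewing in Wang and Minett's tests) and an estimate of how often that threshold is crossed by chance.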

This simulation is very promising, but has two limitations. The first is intrinsic 
to this kind of modeling, which gives a helpful background calculation of the likely 
signals of contact, but crucially needs to be tested against real data across a range 
of contact situations, which may be substantially more complex than the simu- 
lated cases. The second, however, is specific to this situation, as Wang and Minett 
(2005) focus on related languages; and while many instances of borrowing do 
involve related languages, by no means all will. 

Returning to Swadesh lists, another approach to detecting borrowing involves 
assessing whether some meanings might be more susceptible to borrowing than 
others. Pagel, Atkinson, and Meade (2007) estimated the rate of change for the 
items in the 200-meaning Swadesh list, over 87 Indo-European languages, from 
a database developed by Dyen, Kruskal, and Black (1992), reporting that they 


observe a roughly 100-fold variation in rates of lexical evolution among the 
meanings. At the slow end of the distribution, the rates predict zero to one cognate 
replacements per 10,000 years for words such as “two,” “who,” “tongue,” “night,” 
“one” and “to die.” By comparison, for the faster evolving words such as “dirty,” 
“to turn,” “to stab” and “guts,” we predict up to nine cognate replacements in the 
same time period. 


Pagel, Atkinson, and Meade (2007) suggest, on the basis of frequency counts 
for corpora of modern English, Spanish, Russian, and Greek, that the more fre- 
quently a meaning is used, the slower its rate of change has been through the 
history of Indo-European. If this finding turns out to be consistent across language 
families, it would indicate that the words carrying the most retentive meanings 
may not have been replaced for thousands of years, so these might preferentially 
be used in proposing or substantiating long-range, ancient language relationships. 
This also means more frequent meanings might be less readily borrowable, 
though again it must be remembered that borrowing is only one source of 
replacement: a word carrying a given meaning might also be replaced internally, 
by another word within the same language which undergoes semantic change. 

Although Pagel et al. (2007) introduce this connection of rate of replacement 
with frequency, they are not the first to propose that some parts of the Swadesh 
list might be more likely to change, and more susceptible to borrowing, than 
others. McMahon and McMahon (2003), following doctoral research by Marisa 
Lohr, identified two extreme sublists from the 200 Swadesh meanings, referred 
to as the hihi list (where meanings are highly retentive, and highly recon- 
structible), and the lolo list (with less stable, and more borrowable meanings). 
Sublists were compared over the same Dyen et al. (1992) database of Indo- 
European languages and varieties used by Pagel et al. (2007), and trees were drawn 
using only the most conservative, and only the least conservative meanings. These 
trees were typically formally identical in cases where there were no borrowings 
(though branches might be somewhat longer, indicating more change, for the least 
conservative lolo items); however, differences in tree structure corresponded to 
cases where loans had already been identified by e.g. Embleton (1986) or Kessler 
(2001): “for the less conservative meanings, the borrowing language tends to move 
towards the language or group which is the source of the loans” (McMahon et al. 
2005: 149). 

Parallel shifts of position in trees and networks are found for languages set to 
undergo borrowing in the simulations carried out by McMahon and McMahon 
(2005: §4.4.4), reinforcing the Indo-European findings. However, McMahon 
et al. (2005) apply the same methodology to a comparison of Andean languages 
where genetic relatedness has not been established. This involves asking the 
“Quechumara” question: Do the many similarities between Quechua and the 
Aymara group indicate common ancestry, or not? As McMahon et al. (2005: 151) 
observe: 


What is not in doubt is the quality or quantity of the correspondences. The *yaca- ~ 
*yaci- [‘to know’, AM] example is just one of many hundreds of equally clear and 
often identical form-to-meaning correspondences that seem to go back all the way 
to Quechua and Aymara proto-forms. Matches so numerous and so close clearly 
exclude chance as an explanation: whether they reflect contact or common origin, 
there is unquestionably some very direct connection between the Quechua and 
Aymara language families. 


The question is how we might try to establish the cause of these resemblances, 
and initially it might seem that we must exclude any version of lexicostatistics, 
computational or otherwise, since this relies on the comparative method: after all, 
if we are meant to be counting cognates in Swadesh lists, we need to know which 
forms are cognate in the first place. Recognizing that calling these similar forms 
cognates rather begs the question, McMahon et al. (2005: 154) take a Kessler-type 
approach, comparing correlates rather than cognates, where “correlate is a cover term 
for any striking form-to-meaning correspondence more convincingly attributable 
to some (unspecified) historical connection than to chance.” Following work 
by Heggarty, they also adopt a more complex version of lexicostatistics, where 
meanings in different languages are scored on a scale reflecting descending 
levels of mutual intelligibility, and the plausibility of correlates is also reflected 
in a score of 0-7. Finally, the Swadesh list is replaced by Heggarty’s 150-meaning 
CALMA list, whose items are selected to be Culturally and Linguistically 
Meaningful for the Andes. This correlate scoring technique for the CALMA list 
provides numerical results which can be compared with those for more conven- 
tional lexicostatistics, though it is important to remember that aspects of the method- 
ology and lists do differ. 

McMahon et al. (2005) establish hihi (much more conservative) and lolo (more 
changeable and susceptible to borrowing) sublists from the basic meaning list; 
for the Andean languages, these overlap substantially with the sublists for Indo- 
European (see (1) below): 


(1) Andean hihi list, 30 items 


one      two      three               four     five 
I        thou     not                 ear      tongue 
tooth    foot     fingernail (claw)   heart    name 
day      night    sun                 star     shadow 
wind     salt     green               new      come 
eat      sleep    live (be alive)     give     sew 


Bold items are in both the Andean and Indo-European hihi lists; those in the 30- 
item Indo-European hihi list but not the Andean list are: long, other, thin, mother, 
spit, stand. 




(2) Andean lolo list, 30 items 


year       left (hand side)   face               mouth    lip 
neck       (upper) back       skin (human)       breast   bird 
tail       wing               man (male adult)   river    stone 
bread      branch             grass              rope     red 
straight   sick (be ill)      far (away)         heavy    empty 
hot        walk               swim               think    push 


Bold items are in both the Andean and Indo-European lolo lists; those in the 
23-item Indo-European lolo list but not the Andean list are: near, smooth, flow, pull, 
throw. 

McMahon et al. (2005) find an average of 6.7% Spanish loans (which are fairly 
easy to identify) in the Andean lolo sublists, but 2.7% in the hihi sublists, which 
suggests that these sublists are differentially prone to borrowing in the same 
way as for Indo-European. Overall, they find 20% distance between the Quechua 
varieties and the Aymara group for the lolo sublist, and 54% distance for the hihi 
items. In other words, these two language groupings are most similar in respect 
of the meanings we independently know to be more susceptible to borrowing. 
This does not prove that contact rather than common ancestry is responsible 
for the similarities between Quechua and Aymara, but it strongly supports that 
hypothesis. Taking advantage of this kind of method, however, requires an 
acceptance of borrowing in the basic vocabulary, and the development of tech- 
niques to detect it: 


if we attempted to remove loans from our data a priori, or took the view that Swadesh- 
type lists are so resistant to borrowing that the presence of loans should not be an 
issue, we could not use sublisting as a technique for discovering likely loans and 
would lose an opportunity to cast light on language histories which cannot be illu- 
minated using more traditional methods of comparison. (McMahon et al. 2005: 168) 
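
The logic of the sublist comparison can be sketched in Python. This is a deliberately simplified illustration: the study itself uses graded correlate scores, whereas the sketch assumes a plain yes/no judgment for each meaning. The function name `sublist_distance`, the four-item sublists, and the match values are all hypothetical; the meanings are taken from lists (1) and (2), but the judgments attached to them are invented.

```python
from typing import Dict, Iterable


def sublist_distance(matches: Dict[str, bool], sublist: Iterable[str]) -> float:
    """Percentage of sublist meanings for which two languages do NOT match."""
    items = [meaning for meaning in sublist if meaning in matches]
    if not items:
        raise ValueError("no scored meanings in this sublist")
    mismatches = sum(1 for meaning in items if not matches[meaning])
    return 100.0 * mismatches / len(items)


# Abbreviated sublists; the full lists have 30 items each.
hihi = ["one", "two", "tongue", "night"]
lolo = ["river", "stone", "red", "heavy"]

# Invented judgments: True means the two languages have matching forms.
matches = {
    "one": False, "two": False, "tongue": True, "night": False,
    "river": True, "stone": True, "red": False, "heavy": True,
}

hihi_distance = sublist_distance(matches, hihi)
lolo_distance = sublist_distance(matches, lolo)
print(hihi_distance, lolo_distance)

# Greater similarity on the more borrowable sublist is the pattern
# that points towards contact rather than common ancestry.
print(lolo_distance < hihi_distance)
```

The comparison is only informative if loans are left in the data, which is the point made in the quotation above.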


3 Phonetics and Phonology 


While lexical comparison has for some time been fairly common in historical lin- 
guistics (if not wholly uncontroversial), methods for phonetic comparison have 
only made an appearance quite recently. Lexicostatistics may have been damaged 
by its frequent association with glottochronology, the largely discredited attempt 
to assign dates to splits in family trees, but conversely it has partly been validated 
by its links with the comparative method. If comparing Swadesh lists means 
calculating the number of cognates shared by languages, and if cognacy can only 
be established through the comparative method, then traditional lexicostatistics 
at least can only operate if we already have reasonable evidence of language related- 
ness. It is true that phonological resemblances are also central to the compar- 
ative method, since regular correspondences involve both sound and meaning; 
and indeed Ringe et al. include a limited number of phonological characters in 
their dataset, including P22, syncope of short vowels in final syllables next to *s 
and after semivowels, and P4, lenition of stops after long vowels and unstressed 
vowels (Ringe et al. 2002: 113-16). However, such characters are chosen for their 
utility in establishing the first-order splitting of subfamilies in Indo-European, and 
are therefore necessarily features of that particular language family: thus, P22 is 
present in only Oscan and Umbrian, distinguishing these from the other Indo- 
European languages tested, where it maintains its ancestral, absent state; and 
likewise, P4 is present only in Hittite, Luvian, and Lycian, therefore giving evi- 
dence for the Anatolian subfamily. If we wish to generalize our explorations to 
other families, or to measure the distance between sounds directly, rather than 
between states of preselected characters (see McMahon & McMahon 2005, ch. 8), 
we require different methods. 

An overview of phonetic comparison algorithms is provided in Kessler (2005), 
who catalogues many possible applications for measures of phonetic similarity, 
especially if these are amenable to computational automatization: these include 
identification of speakers in forensic linguistics; comparison of an actual spoken 
utterance to a reference model in speech therapy, child language acquisition or 
second language acquisition; and diagnosis of specific articulatory difficulties. 
Phonetic comparison is currently most frequently encountered in dialectology, where 
it has been applied particularly to German (Goebl 2006) and Dutch (see for 
instance Heeringa & Nerbonne 2001). Heeringa and Nerbonne’s work in particu- 
lar relies on calculation of Levenshtein distances, or string-edit distances, between 
phonetic or phonological transcriptions or representations. The Levenshtein dis- 
tance is the “cost” for the minimum number of insertions, deletions, and substi- 
tutions required to convert one of these strings into the other. As Kessler (2005: 
253) observes, 


The basic version of the Levenshtein measure is binary: it assigns uniform costs 
(1) for all insertions and deletions and for all substitutions that do not involve a pair 
of identical segments; matches of identical segments always have a cost of 0. More 
commonly, people prefer to assign different costs to different insertions, deletions, 
and substitutions. 


This means that the calculation of replacement costs can vary considerably across 
different implementations of Levenshtein distances, as indeed can the rationale 
for assigning those particular costs. Furthermore, the transcriptions over which 
the string-edits are calculated may be much more or much less detailed and 
phonetically narrow; some can be close to phonemic transcriptions, and in the 
extreme case orthographic representations may be used as a proxy (see Ellison & 
Kirby 2006). 
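
To make the difference between the basic and the weighted measure concrete, the following is a minimal sketch in Python. It is not the implementation used by Heeringa and Nerbonne or by any other author cited here. The names `levenshtein` and `graded_cost`, the cost values, and the example strings (ordinary orthographic words, not dialect transcriptions) are all illustrative assumptions.

```python
from typing import Callable, Sequence


def binary_cost(x: str, y: str) -> float:
    """Basic version: identical segments cost 0, any other pair costs 1."""
    return 0.0 if x == y else 1.0


def levenshtein(a: Sequence[str], b: Sequence[str],
                sub_cost: Callable[[str, str], float] = binary_cost,
                indel_cost: float = 1.0) -> float:
    """Minimum total cost of insertions, deletions, and substitutions
    needed to convert sequence a into sequence b."""
    # prev[j] holds the cost of converting the processed prefix of a
    # into the first j segments of b.
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,         # delete x
                curr[j - 1] + indel_cost,     # insert y
                prev[j - 1] + sub_cost(x, y)  # substitute x by y (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical graded costs: replacing one vowel letter by another is
# treated as cheaper than any other substitution.
VOWELS = set("aeiou")


def graded_cost(x: str, y: str) -> float:
    if x == y:
        return 0.0
    if x in VOWELS and y in VOWELS:
        return 0.5
    return 1.0


print(levenshtein("kitten", "sitting"))               # 3.0
print(levenshtein("kitten", "sitting", graded_cost))  # 2.5
```

The same pair of strings receives a different distance once substitution costs are graded, which is why results from different implementations are not directly comparable unless the cost scheme and the level of transcription detail are both reported.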

Probably the most pressing questions in the developing area of phonetic com- 
parison at present are whether it is preferable to use off-the-peg comparison 
metrics like Levenshtein distances or purpose-designed ones specifically for phon- 
etics; and how much detail is required in the transcriptions to gain maximum 
value from the results. Both of these are, of course, empirical questions, and both 
are being widely debated. Heggarty, McMahon, and McMahon (2005) propose a 
comparison program based on articulatory phonetics, and McMahon et al. (2007) 
illustrate some results for accents of present-day English. Figure 6.1 shows a 
network diagram for a comparison of detailed phonetic transcriptions of a basic 
vocabulary list of 110 Germanic cognates, for a subset of the 91 modern English 
varieties collected (see www.soundcomparisons.com for further details). For 
most of the locations in Figure 6.1, three subvarieties are compared: the Typical 
working-class pronunciation for each word, the Traditional pronunciation of 
older working-class males, and the Emergent variety spoken by younger speakers 
in their late teens and early twenties. 


Figure 6.1 Selection of Traditional, Typical, and Emergent varieties 
[Network diagram not reproduced. Nodes are labeled by location (Devon, London, Dublin, Norwich, Sheffield, Berwick, Middlesbrough, Tyneside, Glasgow, Edinburgh) and by subvariety (Trad, Typ, Emg).] 

While most Emergent subvarieties in Figure 6.1 are close to the Typical sub- 
variety for the same location, there are exceptions. Notably, Sheffield Emergent 
is closer to Tyneside than to the other Sheffield varieties; Devon Emergent is sub- 
stantially closer to London; and both Edinburgh and Glasgow Emergents appear 
to be shifting closer to the English English accents, and away from the other Scottish 
varieties. 

One might argue that the additional phonetic detail included in these investi- 
gations, and the use of a comparison method more complex than Levenshtein 
distances, is providing us with additional and more nuanced information on accent 
variation and on sound change in progress. However, there are two limitations 
to this research at present. First, we need further analysis to identify the features 
giving rise to the relative closeness between varieties; as it turns out, the shifts of 
Emergent Devon toward London, and of the Scottish Emergents away from the 
other Scottish varieties, owe a great deal to the progressive loss of rhoticity in the 
West Country and in Scotland for younger speakers, but we cannot read this directly 
from the diagrams. Nor can we assume that the same kind of signal in the net- 
work, which we see for instance in the relative closeness of Sheffield Emergent 
to Tyneside, follows from an ongoing change in the same feature. Perhaps more 
importantly, even further analysis of the networks does not allow us to establish 
the cause of similarities between varieties. We can work out that the ongoing 
loss of postvocalic /r/ is partly responsible for the affinities between Devon 
Emergent and London accents, or Glasgow Emergent and northern English 
accents; but not whether this reflects contact (either face-to-face or through the 
media; see Stuart-Smith, Timmins, & Tweedie 2007), or the independent parallel 
development of a common and phonetically natural change. Although in this 
particular case we know rhoticity was the ancestral state, for other features, yet 
another alternative explanation might be retention of the feature in question in 
some varieties, but loss in others. 

It is clear, then, that a great deal more work is needed in determining the pre- 
ferred methods of phonetic comparison and of visualizing the results, to give 
us the best chance of establishing explanations for the patterns we see; at the 
moment, measuring similarity does not tell us how the similarity got there. In 
addition, much of the research on phonetic similarity has involved accents of the 
same language, or closely related languages; if phonetic comparison is to be used 
in attempts to determine whether languages are related or not, there is a long 
way to go in terms of developing flexible but detailed methods of comparison 
across languages. 


4 Morphosyntax 


Morphosyntactic comparison is the least developed area of those considered 
here. It is, however, potentially rather important to extend quantitative methods 
to morphosyntax, not just because of the general desirability of comprehensive- 
ness, but because certain grammatical features have been argued to have a spe- 
cial status in the context of language contact. For instance, Nakhleh, Ringe, and 
Warnow (2005: 386) argue that: 


Borrowing of states between significantly different languages is also a problem, but 
one that has been seriously overestimated in the recent literature. The assumption 
that “anything can be borrowed” from one language into another has been given 
wide currency by Thomason and Kaufman 1988, but many who cite that work 
have paid too little attention to its authors’ clear conviction that the borrowing of 
inflectional morphology is difficult and rare. 


If borrowing of some morphological and syntactic features is indeed “difficult 
and rare,” it is clearly important to develop databases that include structural as 
well as lexical characters, and specifically morphological as well as phonological 
ones. Nakhleh et al. (2005: 395) explicitly encode this special status of structural 
characters by requiring any successful tree to be compatible with almost all their 
phonological characters (the exceptions being P2 and P3), and eight of their 
morphological characters (namely M3, M5-6, M8, M12-15). 

However, even if we do accept that “lexical characters ... in general provide 
the least secure evidence for subgrouping” (Ringe et al. 2002: 99), and therefore 
incline towards prioritizing structural and in particular morphosyntactic evi- 
dence in proposing families and subgroups, there are two major questions which 
we are nowhere close to resolving. The first of these is the continued lack of 
consensus on the features which are more or less likely to be borrowed. Nichols 
(1992; this volume) has proposed that certain grammatical features are extremely 
stable; these are consequently not only resistant to contact, but might allow us to 
propose distant relationships among languages or language families, too distant 
to be supported by lexical similarities, which will become unreliable over greater 
time depths. In this category of super-stable features Nichols places deep typo- 
logical signatures like head versus dependent marking, and core argument 
categories. On the other hand, Mithun (in press) argues that some of these very 
features may be found across linguistic areas, for instance in the central 
Northwest Coast of the US. Sometimes there is direct borrowing of linguistic 
material, but in other cases we find semantic or conceptual structure being 
borrowed, and then “clothed” in material from the borrowing language. Even if 
borrowing of morphosyntax is less common, this kind of strategy is likely to make 
it even more difficult to detect. 

The second, and perhaps more serious, difficulty in morphosyntactic comparison 
is exactly what we should be measuring. Although he accepts that “Certainly the 
idea that some kinds of grammatical correspondences are probative is correct,” 
Kessler (2001: 102) is overall very concerned about the prospects for such 
comparison: 


To be absolutely clear where my reticence lies, let me reiterate that the trouble with 
using grammatical elements lies essentially in the extreme difficulty of coming up 
in advance with objective lists of grammatical categories that can be objectively matched 
with morphemes in languages of radically different typologies across the world. That 
is, the problem is that they are not as easily amenable to statistical treatment of any 
type as are full meaning-bearing words. 


However, two developments since Kessler’s book was published have meant this 
is no longer a knock-down argument, though it still needs to be addressed. On 
the one hand, as we saw in section 2 above, it has increasingly been recognized 
that basic vocabulary lists are not entirely comparable across languages and 
cultures, but may need to be altered, at least at the margins, to allow both appro- 
priateness to the specific language context, and more general comparability. This 
means the gap between (allegedly universal) basic vocabulary, and (necessarily 
language-specific) morphosyntax has narrowed considerably. On the other hand, 
ongoing work on establishing typological databases has shown that comparabil- 
ity, and consequently quantitative testability, can be established — though this is 
necessarily a collective enterprise, which requires considerable discussion and 
involvement of experts on a wide range of language families. Wichmann (2008), 
in an excellent overview article, surveys developments in this “emerging field of 
language dynamics,” which combines substantial and ideally global databases with 
simulations. 

As Wichmann (2008) notes, much of the progress in this area is due to the World 
Atlas of Language Structures, which exists in published form (Haspelmath et al. 
2005), but even more crucially is under development as a dynamic online 
system, currently with 142 features (www.wals.info). This WALS database 
contains some phonological features (like tone, or front rounded vowels) and a 
few lexical ones (like “finger” and “hand” polysemy, or the number of basic color 
categories), but for the most part consists of typological morphosyntactic categories 
which may be realized very differently cross-linguistically. These include locus 
of marking in possessive noun phrases; the number of genders, and whether 
gender is sex-based or not; the morphological imperative; and the order of 
Object, Subject, and Verb. 

This development is extremely promising, but it is also currently, and perhaps 
intrinsically, limited. Agreeing the features to be included is a huge step forward, 
but populating the database with full data for all languages included is another, 
and as yet the data are still somewhat sparse: at the time of Wichmann’s (2008: 
2) article, “for 1556 languages less than 20 features are attested and only for 230 
languages are 60 or more features attested. Thus, only a few hundred languages 
may be considered well-attested.” While time and cooperation may resolve this 
problem, the intrinsic issue will remain of how far we can infer phylogeny, or 
language relationship, from typological features of this kind. This is in part an 
empirical question, so that further analysis on the basis of better-populated 
databases is key; but as Wichmann (2008: 7) himself accepts, it is not safe to assume 
that morphosyntactic features are automatically resistant to borrowing: 


A clear result from the inspection of WALS maps and statistical investigations of the 
data they display is that any feature, if it exists in a given area, may diffuse... with 
certain unrelated languages being extremely similar and certain related languages 
quite different. Examples of extreme cases... are the two Niger-Congo languages 
Zulu and Ijo, which share only 28.8% similarities in terms of WALS features and the 
unrelated languages Vietnamese (Austro-Asiatic family) and Thai (Tai-Kadai family), 
which are 80.9% similar ... Thus, typological similarity is not a good predictor 
of relatedness. 


This also means, as Wichmann notes, that we must be cautious of claims of related- 
ness which rest on such typological features. Dunn et al. (2005: 2072) suggested 
“the divergence of the Papuan languages from a common ancestral stock” on the 
basis of such similarities; and while they may well have detected indications of 
ancient relatedness, they also accept, following concerns raised by Donohue and 
Musgrave (2007), that they are “unable to tease ancient contact and phylogeny 
apart” (Dunn et al. 2007: 401). Wichmann and Saunders (2007) develop this 
question of how typological databases can be used to draw historical linguistic 
inferences, focusing on methodological requirements. In particular, Wichmann 
and Saunders return to the relative stability of typological features, the obvious 
conclusion being that we should prioritize the most stable features (if only we 
can agree what those are), perhaps by weighting some characters over others. They 
also suggest that historical linguists will inevitably become more used to estimating 
and interpreting measures of likelihood or confidence when assessing hypo- 
theses of relatedness: “Once this happens, emotionally charged arguments from 
beliefs about what it takes for a given genealogical relationship to be ‘proved’ 
may be avoided and replaced by cooler, statistical reasoning” (2007: 400). Finally, 
Wichmann (2008) turns Ringe et al.’s arguments on the superiority of structural 
characters back on themselves interestingly, for comparisons across families at least, 
by proposing that more reliable indicators of relatedness include both lexical and 
structural data. Indeed, Wichmann (2008: 9) suggests that “A weighting should 
be produced such that information from the lexicon feeds into about three 
fourths of each similarity measure and information from typology accounts for 
one-fourth of the measure.” Clearly the debate on the priority we should accord 
to different features, depending on what we are trying to establish, is by no means 
closed. 
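
The two quantities discussed in this section, the share of typological feature values two languages have in common and a combined measure in which the lexicon contributes three fourths and typology one fourth, can be sketched as follows. This is a minimal illustration under stated assumptions, not Wichmann's procedure: the function names, the feature names and values, and the lexical similarity figure of 0.2 are all hypothetical, and the treatment of unattested features (skipping any feature not recorded for both languages) is one possible choice among several.

```python
from typing import Mapping, Optional

# A feature value of None means "not attested for this language".
Features = Mapping[str, Optional[str]]


def typological_similarity(a: Features, b: Features) -> Optional[float]:
    """Share of identical values among features attested in both languages.
    Returns None when no feature is attested in both."""
    shared = [f for f in a
              if f in b and a[f] is not None and b[f] is not None]
    if not shared:
        return None
    same = sum(1 for f in shared if a[f] == b[f])
    return same / len(shared)


def combined_similarity(lexical: float, typological: float,
                        lexical_weight: float = 0.75) -> float:
    """Weighted mean; by default the lexicon feeds three fourths of the
    measure and typology one fourth."""
    return lexical_weight * lexical + (1.0 - lexical_weight) * typological


# Hypothetical feature tables for two invented languages.
lang_x = {"word_order": "SVO", "tone": "complex",
          "genders": "none", "front_rounded_vowels": None}
lang_y = {"word_order": "SVO", "tone": "complex",
          "genders": "two", "front_rounded_vowels": "none"}

typ = typological_similarity(lang_x, lang_y)  # 2 of 3 comparable features agree
print(round(typ, 3))
print(round(combined_similarity(0.2, typ), 3))
```

Because only features attested for both languages enter the comparison, a sparsely populated database can yield percentages that rest on very few features, which is one reason the sparseness of the data noted above matters for any such measure.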


5 Looking Forward 


These are exciting times for the application of quantitative methods to questions 
of language contact; but the challenges are real, and there are central questions 
of methodology and data still to be addressed. In this final section, I shall high- 
light just four of the areas where research is starting to allow us to diagnose and 
analyze signals of language contact, but where there are still important contri- 
butions to be made. 

First, we urgently require more work on comparing results over different 
datasets, both across different levels of the grammar (cf. Ringe et al. 2002; 
Wichmann 2008) and across different families and geographical areas. If we 
were to discover habitual mismatches between different kinds of data, this could 
help us understand whether contact preferentially affects particular areas of the 
grammar, or even specific constructions or features. In pursuing such compara- 
tive work, we need to move away from simplistic assumptions that, for instance, 
basic vocabulary, or certain grammatical constructions, are never borrowed — these 
assumptions need to be tested and scrutinized. 

Secondly, we need serious development of cross-linguistic databases, to allow 
comparative research to proceed. There are positive indications that this kind of 
resource building is under way — the World Atlas of Language Structures, and 
collections of Swadesh-type lists like the Greenhill, Blust, & Gray (2003-8) 
Austronesian Basic Vocabulary Database, currently with around 500 lists, are 
encouraging examples. However, we need to remember that from the perspec- 
tive of language contact in particular, there are considerable advantages in 
including both lexical and structural data, and for both of these domains, in extend- 
ing coverage beyond data which are thought to be resistant to borrowing. If we 
want to use comparative, quantitative methods to tell us more about language 
contact and its effects, there is no point in focusing only on those features which 
are most conservative or which preserve signals of relatedness better. 

Thirdly, linguists borrowing models from other disciplines crucially need to 
understand the assumptions underlying these, to help us determine when we 
can apply techniques from elsewhere, and when we have to construct models 
and visualizations specific to language data. Off-the-shelf methods can save 
time, and can also allow easy comparability with data from other domains (such 
as archeology, anthropology, or genetics); but we must be sure that the data and 
the processes of change we find in language do not violate the assumptions of 
the models. This also means we need to focus on comparing methods over the 
same data, to provide independent evidence of what works best and under what 
circumstances; this work has already been begun by Nakhleh, Warnow, Ringe, 
and Evans (2005), and Wichmann and Saunders (2007), for instance. 

Finally, there is a strong case for developing simulations, which are emerging 
as a central method in research on a range of aspects of cultural evolution. Using 
simulations allows us to shift the focus from diagnosing contact where it has already 
happened (which is always going to be a major issue as long as historical lin- 
guistics is primarily concerned with families and classification, and therefore 
contact is seen as “other”), and toward predicting what might 
happen when languages and populations come into contact. This is a vital question 
for those interested in language shift, where the issue is not only the impact 
of contact on the structure of a language or languages, but also its impact on their 
differential survival. Wichmann (2008) again provides a helpful survey of models being 
employed in this area; he also raises the central question of how detailed simu- 
lations need to be, suggesting that “The models used should have a certain degree 
of realism, but should not try to imitate a complicated reality” (2008: 4). 
However, while it is important to keep the parameters fairly simple to allow them 
to be manipulated and tested, the characteristics of simulations arguably need to 
be close enough to the core ingredients of real-world situations to be seen as valid 
and revealing. In particular, one of the problems of recent modeling contributions 
to research on language shift (see Abrams & Strogatz 2003) is the absence of any 
possibility of bilingualism; agents simply shift from speaking only Language A 
to speaking only Language B. This is, of course, at odds with research on real 
cases of language shift, where bilingualism is a sine qua non for language death; 
and it is heartening to see a bilingualism condition now being incorporated in 
more recent simulations, notably in Kandler and Steele (2008). 
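
The kind of two-state model criticized here can be sketched in a few lines. The code below follows the form in which the Abrams and Strogatz (2003) model is usually stated, where x is the fraction of the population speaking Language A, s is the relative status of A, and speakers move between the two monolingual states at rates proportional to x^a s and (1 - x)^a (1 - s). It is an illustration only: the function name, the simple Euler integration, the step size, and the status values are assumptions made for this sketch, and the exponent default of 1.31 is the value generally reported for that model.

```python
def simulate_shift(x0: float, s: float, a: float = 1.31, c: float = 1.0,
                   dt: float = 0.01, steps: int = 5000) -> list:
    """Euler integration of a two-state shift model.

    x is the fraction speaking Language A; everyone else speaks only
    Language B. There is no bilingual state, which is exactly the
    limitation discussed in the text.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        gain = (1.0 - x) * c * (x ** a) * s            # B speakers shifting to A
        loss = x * c * ((1.0 - x) ** a) * (1.0 - s)    # A speakers shifting to B
        x = x + dt * (gain - loss)
        x = min(1.0, max(0.0, x))                      # keep x a valid fraction
        trajectory.append(x)
    return trajectory


# Equal starting populations; only the assumed status of Language A differs.
low_status = simulate_shift(x0=0.5, s=0.4)
high_status = simulate_shift(x0=0.5, s=0.6)
print(low_status[-1] < low_status[0])    # True: A loses speakers
print(high_status[-1] > high_status[0])  # True: A gains speakers
```

With equal starting populations, whichever language has the lower status is driven toward extinction, and at no point does any agent command both languages. A model with a bilingualism condition would need at least a third state through which speakers pass.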


7 Contact and 
Language Shift 


RAYMOND HICKEY 


1 Introduction 


Among the many contact situations those which involve language shift occupy a 
special position. All language shift scenarios have in common that at the outset 
there is one language and at the end another which is the majority language 
in the community which has experienced the shift. This is true now and must 
also have been so in history and prehistory when countless cases of shift occurred. 
Just consider the early Indo-European migrations. Movements of subgroups 
of this family into new geographical locations usually meant that the pre-Indo- 
European populations were “absorbed,” i.e. that they shifted in language (and 
culture) to the branch of Indo-European they were exposed to. This shift may be 
partial or complete: on the Iberian peninsula, for instance, it was partial, with Basque 
remaining, but in the British Isles it was complete. The shift may have lasted into 
history, making the “absorption” more visible, as was the case with Etruscan in 
Italy. Whether the Indo-European branches still show traces of this early contact 
and shift is much disputed. But going on shift scenarios today and assuming that 
the same principles of contact applied then as now, one can postulate the 
influence of earlier groups on later groups if the size of the shifting population 
was sufficient for the features of its shift variety to influence the language they 
were shifting to. This is not always the case, however, so a note of caution should 
be struck here. Moving forward to recent history one can see in the anglophone 
world that language shift did not always leave traces of the original language(s). 
The considerable shift of Native Americans to English has not affected general forms 
of English in either the USA or Canada. What may occur is that the shift variety 
establishes itself as a form in its own right, focused with a stable speech com- 
munity, cf. South African Indian English (Mesthrie 1992), but even then there is 
usually a further approximation to supraregional forms of English which reduce 
the specific profile of the shift variety, cf. Australian Aboriginal English and Maori 
English in New Zealand. 


1.1 Motivation for shift 


The motivation for language shift and the circumstances under which it takes place 
will of course vary from case to case but there is sufficient common ground for 
generalizations to be made about language shift and for the analysis of a single 
instance to be of broader value to the study of language shift as a whole. 

For the following discussion the language shift which took place in Ireland, 
roughly between the early seventeenth century and the late nineteenth century, 
will be considered. This is a shift from the original language of the vast majority 
in Ireland, Irish, to English, a language which was imported to Ireland in the late 
twelfth century and which is now (early twenty-first century) the language of over 
99 percent of the Irish population, in both the north and south of the country. 

English has not always been the dominant language in Ireland. Initially, this 
was Anglo-Norman (Cahill 1938) and then Irish so that in the fourteenth and 
fifteenth centuries, English had receded to the east coast and was only found in 
any strength in the towns. But after significant victories for the English at the begin- 
ning of the seventeenth century, after the settlement of the north with Lowland 
Scots and people from northern and north-western England and after the 
Cromwellian campaigns and settlements of the 1640s and early 1650s, the fortunes 
of the Irish language and culture declined and were to wane steadily in the 
centuries between that time and the present. Currently, Irish is spoken natively 
by not more than 50,000 people (probably by many fewer) in three main pockets 
on the western seaboard. The Irish spoken by these individuals is very strongly 
influenced by English and it will be considered at the end of this chapter. But in 
the main, the chapter will deal with the rise of vernacular varieties of Irish 
English during the shift period of the past few centuries. 

The information presented here is intended to highlight key aspects of language 
shift. For reasons of space nothing like a comprehensive treatment can be offered. 
A much more detailed analysis of contact phenomena can be found in the central 
section of Hickey (2007). 


1.2 The nature of the shift 


The shift in Ireland must have involved considerable bilingualism over several 
centuries. The native language for the majority of the population was initially Irish 
and recourse to this was always there. English would have been used in contact 
with English speakers (administrators, bailiffs, or those few urbanites who only 
spoke English). There was also considerable interaction between the planters and 
the native Irish, certainly in the countryside where this group of English speakers 
had settled. Indeed there may be grounds for assuming that a proportion of the 
planters by the mid seventeenth century would have had at least a rudimentary 
knowledge of Irish. They would have been a source of bilingualism for the native 
Irish population, at the interface between themselves and those planters without 
any Irish. However, this source of bilingual interaction should not be over- 
estimated. There would seem to be little evidence for the view that key features 
of Irish English arose through the interaction with bilingual people of English 
origin. The planters in the south of Ireland numbered a few thousand at the most 
while there were several million Irish speakers, probably between seven and eight 
million before the onset of the Great Famine (1845-8). Furthermore, the view that 
the planters were cared for by Irish nurses and had contact with the children of 
the native Irish is supported by authors like Bliss (1976: 557). The ultimate effect 
of this would have been to render the language of the planters more like that of 
the native Irish so that no specific variety of planter English arose. 

The language shift did not progress evenly across the centuries. Major external 
events, chiefly famine and emigration, accelerated the pace. During such setbacks, 
Irish quickly lost ground which it was not to recover. Famine struck throughout 
the eighteenth century, especially in the 1720s and in 1740-1, and emigration from 
Ulster was considerable during this century, though this largely involved settlers 
of Scottish origin who moved to North America. 

The most significant blow to the Irish language was the Great Famine of the 
late 1840s which hit the poorer rural areas of Ireland hardest. The twin factors of 
death and emigration reduced the number of Irish speakers by anything up to 
two million in the course of less than a decade. The famine also brought home 
to the remaining Irish speakers the necessity to switch to English to survive in 
an increasingly English-speaking society and to prepare for possible emigration. 


2 What Can Be Traced to Contact? 


It goes without saying that there is no proof in contact linguistics. If a structure 
in one language is suspected of having arisen through contact with another, then 
a case can be made for contact when there is a good structural match between 
both languages. Take as an example the phrases at the beginning of the following 
sentences which have an exact equivalent in Irish: 


(1) a. More is the pity, I suppose. (TRS-D, S42, M) 
Is mór an trua, is dóigh liom. 
[is big the pity, is suppose with-me] 

b. Outside of that, I don’t know. (TRS-D, W42-2, F) 
Taobh amuigh de sin, níl a fhios agam. 
[side out of that, not-is know at-me] 

c. There’s a share of jobs alright. (TRS-D, 57, M) 
Tá roinnt jabanna ann, ceart go leor. 
[is share jobs-GEN in-it right enough] 


However, the case for contact as a source, at least as the sole source, is consider- 
ably weakened if the structure in question is attested in older forms of the 
language which has come to show it. There are features in Irish English of this 
type, that is they could have a source either in older forms of English taken to 
Ireland or in Irish through contact. An example of this is provided by doubly marked 
comparatives. In Irish, comparatives are formed by placing the qualifier níos ‘more’ 
before the adjective and inflecting the adjective as well. For instance, déanach ‘late’, 
which consists of the stem déan- and the stem-extending suffix -ach, changes to déanaí 
in the comparative although the comparative particle níos is used as well: 


(2) Beimid ag teacht níos déanaí. ‘We will be coming later.’ 
[will-be-we at coming more later] 


This double marking may have been transferred in the language shift situation. 
But such marking is also typical of earlier forms of English (Barber 1997 [1976]: 
200-1) and may well have been present in input forms of English in Ireland. It 
is still well attested today as in the following examples: 


(3) a. He’s working more harder with the new job. (WER, F50+) 
b. We got there more later than we thought. (DER, M60+) 


In such cases it is impossible to decide what the source is, indeed it is probably 
more sensible to postulate a double source, and to interpret the structure as a case 
of convergence. 


3 The Search for Categorial Equivalence 


Before broaching the details of the case for contact, it is important to consider the 
difference between the presence of a grammatical category in a certain language 
and the exponence of this category. For instance, the category “future” exists in 
the verb systems of both English and Irish but the exponence is different, i.e. 
via an auxiliary will/shall in the first language, but via a verb-stem suffix in the 
second language. This type of distinction is useful when comparing Irish English 
with Irish, for instance when comparing habitual aspect in both languages, as can 
be seen from Table 7.1. 


Table 7.1 Category and exponence in Irish and Irish English 


Category    Exponence in Irish English       Exponence in Irish 
Habitual    (1) do(es) be + V-ing            bíonn + nonfinite verb form 
            They do be fighting a lot.       Bíonn siad ag troid go minic. 
                                             [is-HAB they at fighting often] 
            (2) bees (northern) 
            The lads bees out a lot. 
            (3) verbal -s (first person) 
            I gets tired of waiting for 
            things to change. 


3.1 Category and exponence 


When shifting to another language, temporarily or permanently, adults expect 
the same grammatical distinctions in the target which they know from their 
native language. To this end they search for equivalents in the target language to 
categories they are already familiar with. This process is an unconscious one and 
persists even with speakers who have considerable target language proficiency. If 
the categories of the outset language are semantically motivated then the search 
to find an equivalent in the target is all the more obvious. Here is an illustration. 
In Irish there is a distinction between the second person singular and plural pro- 
noun but not in standard English. In the genesis of Irish English, speakers would 
seem to have felt the need for this nonexistent distinction in English and three 
solutions to this quandary arose: 


(4) a. the use of available material, yielding you # ye 
(ye available from early English input) 
b. the analogical formation of a plural: you # youse < you + {S} 
(not attested before early to mid nineteenth century) 
c. a combination of both (a) and (b) as in you # yez < ye + {S}
(not found before mid nineteenth century) 


In all these cases the search for an equivalent category of second person plural 
was solved in English by the manipulation of material already in this language. 
At no stage does the Irish sibh [ʃɪvʲ] ‘you-PL’ seem to have been used, in contrast,
for instance, to the use of West African unu ‘you-PL’ found in Caribbean English 
(Hickey 2003). 

Apart from restructuring elements in the target, speakers can transfer elements 
from their native language. This transfer of grammatical categories is favored, if 
the following conditions apply: 


1. The target language has a formal means of expressing this category.

2. There is little variation in the expression of this category.

3. The expression of this category is not homophonous with another one.

4. The category marker in the outset language can be identified (is structurally
transparent) and can be easily extracted from source contexts.


Before looking at a case where transfer did actually take place, consider one where
it did not, because the complete lack of equivalence precluded any transfer to
English. Irish has a special form of the verb, known as the
“autonomous,” a finite verb which is not bound to a particular person, i.e. which 
is agentless: 


(5) a. Táthar ann a cheapann go bhfuil an ceart aige.
[is-AUT in-it that think that is the right at-him]
‘There are people who think he is right.’
b. Bristear an dlí go minic.
[break-AUT the law often]
‘The law is often broken.’
c. Cailleadh anuraidh í.
[lost-AUT she last-year]
‘She died last year.’
d. Rugadh mac di.
[born-AUT son to-her]
‘She bore a son.’


Neither in present-day contact English nor in the textual record for Irish English 
is a direct transfer of the autonomous form of Irish attested. Agentless finite verb 
forms are unknown in English and, furthermore, the means of expressing agentivity
in Irish, via a compound form of preposition + pronoun (see example (5d) above),
is not available in English either. Instead the internal means found in English,
namely the passive and generic sentences with there, were and still are used.


3.2 Attested cases of shift 


Where transfer is attested it is worth considering just how this may have taken 
place. In a language shift situation, transfer must first occur on an individual level, 
perhaps with several individuals at the same time. But for it to become estab- 
lished, it must be accepted by the community as a whole. If such transfer is to be 
successful, then it must adhere to the principle of economy: it must embody only 
as much change in the target as is necessary for other speakers in the community 
to recognize what native structure it is intended to reflect. 

To illustrate how this process of transfer is imagined to have occurred in the 
historical Irish context, consider the example of the immediate perfective formed 
by the use of the prepositional phrase tar éis ‘after’. 


(6) Tá siad tar éis an obair a dhéanamh.
[is they after the work COMP do] 
‘They are after doing the work’, i.e. ‘They have completed the work.’ 


The pivotal elements in this construction are listed below; the complementizer a 
is of no semantic significance: 


(7) a. adverbial phrase tar éis ‘after’ 
b. nonfinite verb form déanamh ‘doing’ 
c. direct object obair ‘work’ 


It would appear that the Irish constructed an equivalent to the output structure 
using English syntactic means. Item (a) was translated literally as ‘after’, (b) was 
rendered by the nonfinite V-ing form yielding sentences like They're after doing 
the work. With a translation for tar éis and a corresponding nonfinite form the task 
of reaching a categorial equivalent would appear to have been completed.
Importantly, the Irish word order “object + verb” was not carried over into 
English (*They’re after the work doing). 

In putting the case against transfer Harris (1991: 205) argues that the order of 
nonfinite verb form and object is different in Irish and English and hence that 
transfer is unlikely to have been the source. However, the aim in the contact 
situation was to arrive at a construction which was functionally equivalent to that 
in the outset language. A word order such as that in John is after the house selling 
would not only unnecessarily flout the sequence of verb and object in English 
(unnecessary as it would not convey additional information) but also give rise to 
possible confusion with the resultative perfective which in Irish English is realized 
by means of a past participle following its object. 

In the transfer of structure during language shift, it would seem both necessary 
and sufficient to achieve correlates to the key elements in the source structure. 
Another instance of this principle can be seen with the resultative perfective of 
Irish English. 


(8) Tá an obair déanta acu. ‘They have finished the work.’
[is the work done at-them]
IrEng: ‘They have the work done.’


Essential to the semantics of the Irish construction is the order “object + past 
participle.” Consequently, it is this order which is realized in the Irish English 
equivalent. The prepositional pronoun acu ‘at-them’ (or any other similar form) 
plays no role in the formation of the resultative perfective in Irish, but is the means 
to express the semantic subject of the sentence. As this is incidental to the per- 
fective aspect expressed in the sentence, it was neglected in Irish English. 

The immediate perfective with after does not appear to have had any model in 
archaic or regional English (Filppula 1999: 99-107). With the resultative perfec- 
tive, on the other hand, there was previously a formal equivalent, i.e. the word 
order “object + past participle.” However, even if there were instances of this word 
order in the input varieties of English in Ireland this does not mean that these 
are responsible for its continuing existence in Irish English. This word order could 
just as well have disappeared from Irish English as it had in forms of mainland 
English (van der Wurff & Foster 1997). However, the retention in Irish English 
and the use of this word order to express a resultative perfective can in large part 
be accounted for by the wish of Irish learners of English to reach an equivalent 
to the category of resultative perfective which they had in their native language. 

Another issue to consider is whether the structures which were transferred still 
apply in the same sense in which they were used in previous centuries. It would 
be too simplistic to assume that the structures which historically derive from Irish 
by transfer have precisely the same meaning in present-day Irish English. For 
instance, the immediate perfective with after has continued to develop shades of 
meaning not necessarily found in the Irish original as Kallen (1989) has shown 
in his study. 


4 The Prosody of Transfer


The case for contact should be considered across all linguistic levels. In particular 
it is beneficial to consider phonological factors when examining syntactic transfer 
(Hickey 1990: 219). If one looks at structures which could be traced to transfer 
from Irish, then one finds in many cases that there is a correspondence between 
the prosodic structures of both languages. To be precise, structures which appear 
to derive from transfer show the same number of feet and the stress falls on 
the same major syntactic category in each language (Hickey 1990: 222). A simple 
example can illustrate this. Here the Irish equivalent is given, although that is not 
of course the immediate source of this actual sentence as the speaker was an English- 
speaking monolingual. 


(9) A... don’t like the new team at all at all. (WER, M55+)
Ní thaitníonn an fhoireann nua le hA... ar chor ar bith.
[not like the team new with A... on turn on anything]


The repetition of at all at all creates a sentence-final negator which consists of two 
stressed feet with the prosodic structure WSWS (weak-strong weak-strong) as does 
the Irish structure ar chor ar bith. This feature is well established in Irish English 
and can already be found in the early nineteenth century. 

Consider now the stressed reflexives of Irish which are suspected by many authors 
(including Filppula 1999: 77-88) of being the source of the Irish English use of 
an unbound reflexive. 


(10) ,An 'bhfuil sé 'fhéin ,is'tigh in'niu? ‘Is he himself in today?’ 
[INTERROG is he self in today] 
IrEng: ‘'Is ,him'self 'in ,to'day?’


The strong and weak syllables of each foot are indicated in the Irish sentence 
and its Irish English equivalent above. From this it can be seen that the Irish reflex- 
ive is monosyllabic and, together with the personal pronoun, forms a WS foot: 
sé 'fhéin [he self]. In Irish English the equivalent to this consists of a reflexive 
pronoun on its own: ,him'self, hence the term “unbound reflexive” as no per- 
sonal pronoun is present. If both the personal and reflexive pronoun were used 
in English, one would have a mismatch in prosodic structure: WS in Irish and 
SWS ('he ,him'self) in Irish English. One can thus postulate that the WS pattern
of ,him'self was interpreted by speakers during language shift as the prosodic 
equivalent of both the personal pronoun and reflexive pronoun of Irish ,sé 'fhéin 
and thus used as an equivalent of this.⁵

Another example of prosodic match can be seen with the immediate perfective 
of Irish English (discussed above) which corresponds, in the number of stressed 
syllables, to its Irish equivalent. 


(11) a. She’s after breaking the glass.
Tá sí tréis an ghloine a bhriseadh.
b. He’s after his dinner.
Tá sé tréis a dhinnéir.


This consists in both languages of three or two feet depending on whether the 
verb is understood or explicitly mentioned (it is the number of stressed syllables 
which determines the number of feet). In both languages a stressed syllable introduces
the structure and others occur for the same syntactic categories throughout
the sentence.

A similar prosodic correspondence can be recognized in a further structure, 
labeled “subordinating and” (Klemola & Filppula 1992), in both Irish and Irish 
English. 


(12) a. He went out 'and 'it 'raining.
‘He went out although it was raining.’
b. Chuaigh sé amach 'agus 'é ag cur 'báistí.
[went he out and it at putting rain-GEN] 


Again there is a correlation between stressed syllable and major syntactic cat- 
egory, although the total number of syllables in the Irish structure is greater (due 
to the number of weak syllables). The equivalence intonationally is reached by 
having the same number of feet, i.e. stressed syllables, irrespective of the distance 
between them in terms of intervening unstressed syllables. And again, it is a stressed 
syllable which introduces the subordinate clause. 
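
The foot count behind this comparison can be made explicit with a small sketch in Python. It assumes, purely for illustration, that every stressed syllable is written with a preceding apostrophe, as in the subordinate clauses of (12), and that no other apostrophes occur in the string. The function name count_feet is hypothetical and is not part of the analysis presented in this chapter.

```python
def count_feet(marked: str, stress_mark: str = "'") -> int:
    """Return the number of feet, taken here as the number of stress marks.

    Assumption: each stressed syllable is preceded by exactly one
    stress_mark and the mark is not used for anything else.
    """
    return marked.count(stress_mark)


# Subordinate clauses from example (12), stress marked with an apostrophe.
irish_english = "'and 'it 'raining"
irish = "'agus 'é ag cur 'báistí"

# The clauses differ in their total number of syllables, but the
# comparison below only looks at the number of stressed syllables.
same_feet = count_feet(irish_english) == count_feet(irish)
print(count_feet(irish_english), count_feet(irish), same_feet)
```

On this simplified representation the two clauses come out with the same number of feet, although the Irish clause contains more weak syllables.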

Another case, where prosodic equivalence can be assumed to have motiv- 
ated a nonstandard feature, concerns comparative clauses. These are normally 
introduced in Irish by two equally stressed words 'ná 'mar ‘than like’ as in the
following example: 


(13) Tá sé i bhfad níos fearr anois 'ná 'mar a bhí.
[is it further more better now than like COMP was]
‘It’s now much better than it was.’ 


Several speakers from Irish-speaking regions, or those which were so in the recent 
past, show the use of than what to introduce comparative clauses: 


(14) a. It’s far better than what it used to be. (TRS-D, W42-1, F)
b. To go to a dance that time was far better than what it is now.
(TRS-D, W42-1, F)
c. Life is much easier than what it was. (TRS-D, W42-1, F)
d. They could tell you more about this country than what we could.
(TRS-D, S7, M)


It is true that Irish mar does not mean ‘what’, but what can introduce clauses in
other instances and so it was probably regarded as suitable to combine with than
in cases like those above. From the standpoint of prosody, 'than 'what provided
a combination of two equally stressed words which match the similar pair in 
equivalent Irish clauses. 

The use of than what for comparatives was already established in the nineteenth 
century and is attested in many emigrant letters such as those written from Australia 
back to Ireland, e.g. the following in a letter from a Clare person written in 1854: 
I have more of my old Neighbours here along with me than what I thought (Fitzpatrick 
1994: 69). It is also significant that the prosodically similar structure like what is 
attested in the east of Ireland where Irish was replaced by English earliest, e.g. 
There were no hand machines like what you have today (speaker from Lusk, Co. Dublin). 


5 Coincidental Parallels 


Despite the typological differences between Irish and English there are nonethe- 
less a number of unexpected parallels which should not be misinterpreted as the 
result of contact. Some cases are easy to identify, such as the homophony between Irish sí
/ʃiː/ ‘she’ and English she (the result of the vowel shift of /eː/ to /iː/ in early
modern English). A similar homophony exists for Irish bí ‘be’ and English be, though
again the pronunciation of the latter with /iː/ is due to the raising of English
long vowels. 

Other instances involve parallel categories, e.g. the continuous forms of verbs in 
both languages: Tá mé ag caint léi [is me at talk-NONFINITE with-her] ‘I am talking
to her.’ Indeed the parallels among verbal distinctions may have been a trigger 
historically for the development of nonstandard distinctions in Irish English, i.e. 
speakers during the language shift who found equivalents to most of the verbal 
categories from Irish expected to find equivalents to all of these. An example of 
this is habitual aspect, which is realized in Irish by the choice of a different verb 
form (bíonn [habitual] versus tá [nonhabitual]):


(15) Bíonn sé ag caint léi. ‘He talks to her repeatedly.’
[is-HABITUAL he at talking with-her] 
IrEng: ‘He does be talking to her.’ 


Another coincidental parallel between the two languages involves word order, 
despite the differences in clause alignment which both languages show. In both 
Irish and English prepositions may occur at the end of a clause. A prepositional 
pronoun is the most likely form in Irish because it incorporates a pronoun which 
is missing in English. 


(16) An buachaill a raibh mé ag caint leis. 
[the boy that was I at talking with-him] 
‘The boy I was talking to.’ 


Further parallels may be due to contact which predates the coming of English to
Ireland. For example, the use of possessive pronouns in instances of inalienable 
possession is common to both English and Irish: 


(17) Ghortaigh sé a ghlúin.
[injured he his knee] 
‘He injured his knee.’ 


This may well be a feature of Insular Celtic which was adopted into English, 
especially given that other Germanic languages do not necessarily use possessive 
pronouns in such contexts, cf. German Er hat sich am Knie verletzt, lit. ‘He has 
himself at-the knee injured.’ 


6 What Does Not Get Transferred? 


If the expectation of categories in the target language which are present in the 
outset language is a guiding principle in language shift, then it is not surprising 
to find that grammatical distinctions which are only found in the target language 
tend to be neglected by speakers undergoing the shift. 

The reason for this neglect is that speakers tend not to be aware of grammatical 
distinctions which are not present in their native language, at least this is true in 
situations of unguided adult learning of a second language. What is termed here 
“neglect of distinctions” is closely related to the phenomenon of underdifferen- 
tiation which is known from second language teaching. This is the situation in 
which second language learners do not engage in categorial distinctions which 
are present in the target language, for instance when German speakers do not 
distinguish between when and if clauses in English (both take wenn in German). 
This neglect can be illustrated by the use of and as a clause co-ordinator with a 
qualifying or concessive meaning in Irish English (see remarks on subordinating 
and above): 


(18) Chuaigh sé amach agus é ag cur báistí.
[went he out and it at putting rain-GEN]
IrEng: ‘He went out and it raining.’
‘He went out although it was raining.’


To account for the neglect of distinctions in more detail, one must introduce a 
distinction between features which carry semantic value and those which are 
of a more formal character. Word order is an example of the latter type: Irish is 
a consistently post-specifying language with VSO (verb-subject-object) as the 
canonical word order along with Noun + Adjective, Noun + Genitive for nom- 
inal modifiers. There is virtually no trace of post-specification in Irish English, 
either historically or in present-day contact varieties of English in Ireland. The 
use of the specifically Irish word order would, per se, have had no informational 
value for Irish speakers of English in the language shift situation. 

Another example, from a different level of language, would be the distinction
between palatal and nonpalatal consonants in Irish phonology. This difference in 
the articulation of consonants lies at the core of the sound structure of Irish. It 
has no equivalent in English and the grammatical categories in the nominal and 
verbal areas which it is used to indicate are realized quite differently in English 
(by word order, use of prepositions, suffixal inflections, etc.). 

An awareness of the semantic versus formal distinction helps to account for 
other cases of non-transfer from Irish. For instance, phonemes which do not exist 
in English, such as /x/ and /ɣ/, have not been transferred to English, although
there are words in Irish English, such as taoiseach ‘prime minister’, pronounced
[ˈtiːʃək], with a final [-k] and not [-x], which could have been carriers of the [x]
sound into English during the language shift period. Although the /k/ versus
/x/ distinction is semantically relevant in Irish, it would not be so in English and
hence transfer would not have helped realize any semantic distinctions in the
target language.⁶ A further conclusion from these considerations is that the
source of a sound like /x/ in Ireland can only be retention from earlier varieties 
of English. This explains its occurrence in Ulster Scots and in some forms of 
Mid-Ulster English, but also its absence elsewhere, although it is present in all 
dialects of Irish. 


7 Interpreting Vernacular Features 


There has been much discussion of the role of English input and transfer from 
Irish in the genesis of Irish English, most of which has centered around suggested 
sources for vernacular features. Some developments in English ran parallel to the 
structure of Irish and so appeared in Irish English not so much by transfer, which 
implies a mismatch between outset and target language, but simply by equiva- 
lence. One such development of the later modern period (Beal 2004: 77-85) is the 
be + V-ing (progressive) construction as in What are you reading? This would have 
represented an appropriate equivalent to Irish Céard atá tú a léamh? [what that-is
you COMP reading] or Céard atá á léamh agat? [what that-is at-its reading at-you].
Another development is the rise of group verbs (phrasal verbs, both transitive and
intransitive, prepositional verbs and phrasal-prepositional verbs; Denison 1998:
221). These types of verb occur widely in Irish, e.g. Ná bí ag cur isteach orthu [not
be at put in on-them] ‘Don’t be disturbing them.’ Indeed calques on the English
phrasal and prepositional verbs are a major source of loans from English into Irish 
today (see below). 

The cases just cited represent instances of convergence, i.e. developments in 
two languages which result in their becoming increasingly similar structurally. 
Convergence can be understood in another sense which is relevant to the genesis 
of specific features of Irish English. This is where both English input and transfer 
from Irish have contributed to the rise of a feature. This applies to the do(es) be 
habitual where English input provided periphrastic do and Irish the semantics of 
the structure and its co-occurrence with the expanded (-ing) form. 


Table 7.2 Convergence scenarios in the history of Irish English


1a  Source 1 (English) independent of Source 2 (Irish)
    English development          Irish
    What are you reading?        Céard atá tú a léamh?
                                 [what that-is you COMP reading]
    Outcome in Irish English: Continuous verb phrases maintained

1b  Source 1 (English) independent of Source 2 (Irish)
    Older English input          Irish
    Are ye ready?                An bhfuil sibh réidh?
                                 [INTERROG are you-PL ready]
    Outcome in Irish English: Distinct second person plural pronoun maintained

2   Source 1 (English) provides form and Source 2 (Irish) semantics
    English input                Irish
    (periphrastic/emphatic)      Bíonn sé amuigh ar an bhfarraige.
    He does live in the west.    [is-HABITUAL he out on the sea]
    Outcome in Irish English: Habitual is established, He does be out on the sea.

3   Failed convergence:
    Source 1 (dialectal English) shares feature with Source 2 (Irish)
    English input                Irish
    They were a-singing.         Bhí siad ag canadh.
                                 [were they at singing]
    Outcome in Irish English: A-prefixing does not establish itself

Mention should also be made of features which exist in Irish and in nonstan- 
dard varieties of English in England, but not, curiously, in Irish English. The best 
example of this is a-prefixing (see (3) in Table 7.2). This is recorded for south- 
western British English, e.g. I be a-singing (Elworthy 1877: 52-3, West Somerset). 
Such structures look deceptively Irish: the sentence could be translated directly 
as Bím ag canadh [is-HABITUAL-I at singing]. However, a-prefixing does not occur
in modern Irish English and is not attested in the textual record of the past few 
centuries to any significant extent. Montgomery (2000) is rightly sceptical of a 
possible Celtic origin of this feature, contra Dietrich (1981) and Majewicz (1984) 
who view the transfer interpretation favorably. 


8 The Influence of English on Contemporary Irish 


The discussion thus far has concerned transfer from Irish to English during the 
formative period of Irish English. In present-day Ireland the Irish language has 
no influence on English but the reverse is very much the case. There are virtually
no monoglots of Irish left, except perhaps for very few rural speakers of tradi- 
tional dialect in the three remaining pockets of historically continuous Irish in the 
north-west, west, and south-west of Ireland. The remainder of the native speakers 
are good bilinguals with a command of English almost indistinguishable from 
English speakers in Ireland. 

Given that the social position of Irish is precarious, despite official support from 
the government and many additional institutions, the pressure of English on the 
language is very considerable. After all one has on the one hand a rural language 
spoken by fewer than 50,000 people and on the other the dominant world language. This
situation means that English exercises a significant influence on the structure of 
Irish, something which set in during the nineteenth century (Stenson 1993). 

For native speakers the influence is not so much felt in phonology or in mor- 
phology, given the considerable differences between the two languages on these 
levels. Furthermore, the lexis of Irish has many loans from English which go back 
to the late Middle Ages (Hickey 1997) and which have been adapted to Irish. The 
lexical influence of English is obvious in code-switching (Stenson 1991), i.e. the 
direct use of English words in Irish sentences, and in obvious calques: 


(19) a. Níl muid an-happy faoi.
[not-is we very-happy under-it]
‘We are not very happy about it.’
b. Croí-bhriste a bhí sí mar gheall ar an toradh.
[heart-broken COMP was she on account of the result]
‘She was heart-broken over the result.’


Pragmatic markers, such as well, just, now, really, are commonly inserted into Irish 
sentences, either at the beginning or end or at a clause break: 


(20) a. Tá sé níos deacra, just, ná mar a cheap mé. (CCE-W, M65+)
[is it more difficult just than what COMP thought I]
‘It’s more difficult, just, than I thought.’
b. Well, tá mé ag súil leis an earrach now. (CCE-S, M60+)
[well is I at looking with the spring now]
‘Well, I am looking forward to spring now.’


In syntax, the influence of English is strongest, despite the typological differences 
between the two languages. There are certain structural parallels between Irish 
and English which facilitate the transfer of English patterns into Irish. This has 
been registered for some time by Irish scholars, at least since the mid twentieth 
century, cf. Ó Cuív (1951: 54-5).

English phrasal verbs and verbs with prepositional complements are particularly
common in Irish (Stenson 1997, Veselinović 2006). Often they are translated
(21a) or they are integrated into Irish by having the productive verb-forming
ending -áil attached (21c):


(21) a. Bhí sí déanta suas mar cailleach. (CCE-W, F55+)
[was she done up as a witch]
‘She was done up as a witch.’
b. Sheasfadh sé amach i d’intinn. (CCE-W, F55+)
[stand-COND it out in your-mind]
‘It would stand out in your mind.’
c. Ná bí ag rusháil back amáireach. (CCE-W, F55+)
[not be at rushing back tomorrow]
‘Don’t be rushing back tomorrow.’


Typical word order in Irish has changed in some cases under the influence of 
English. Previously, it was normal to find adverbials in phrase- and sentence-final 
position (indicated in parentheses below). Under the influence of English, 
adverbs (riamh ‘ever’ and fós ‘still’ below) are drawn closer to the elements they modify. This
can be a verb (22a) or a predicative adjective (22b): 


(22) a. Ní fhaca mé riamh rud mar sin (riamh).
[not saw I ever thing like that (ever)]
‘I never saw anything like that.’
b. Tá a seanathair fós beo (fós).
[is her grand-father still alive (still)]
‘Her grandfather is still alive.’


This pattern also applies to the order of verb objects. Direct objects previously 
occurred after prepositional objects in final position, but it is increasingly com- 
mon to find the order typical of English, namely direct object + prepositional object: 


(23) Chonaic mé í thíos ar an trá (í).
[saw I her down on the strand (her)]
‘I saw her down on the strand.’


A further instance is the position of interrogative elements which can occur 
sentence-finally in English and are found more and more often in this position in
Irish: 


(24) a. Chun ceann nua a dhéanamh, nó céard?
[in-order-to one new COMP do or what]
‘In order to do a new one, or what?’
b. Clúdódh sé céard?
[cover-COND it what]
‘It would cover what?’


The readiness of Irish to adopt syntactic patterns of English is also seen in direct 
translations of English idioms. These are usually translated word for word, 
something which is possible in quite a number of cases: 


(25) a. Thóg sé tamall fada, ceart go leor.
[took it time long right enough]
‘It took a long time sure enough.’
b. Bhí orm súil a choinneáil ar an am.
[was on-me eye COMP keep on the time]
‘I had to keep an eye on the time.’
c. Caithfidh tú d’intinn a dhéanamh suas.
[must you your-mind COMP make up]
‘You have to make your mind up.’


Sentence-initial absolute constructions are also increasingly common, introduc- 
ing a type of sentence which is more typical of English than of traditional Irish: 


(26) a. Ag fanacht san iarthar, dúirt an t-aire stáit inné go...
‘Staying in the west, the minister of state said yesterday that...’
b. Le bheith fírinneach, níl ach droch-sheans ann.
‘To be truthful, there is only a slight chance of it.’


The examples just discussed show how permeable the syntax of Irish is, despite 
the obvious typological differences between Irish and English (VSO word order, 
post-modification). Such transfers have occurred between the two languages
through contact, not through shift (they are found with speakers who continue 
to use their native Irish). This situation of contact with the super-dominant 
language English has meant that Irish has been influenced in many subtle, 
infiltrating ways as well. Speakers establish lexical equivalences between Irish and 
English and this can lead to the English range and application of words spread- 
ing into Irish. A good example is the verb faigh ‘get’. The Irish word corresponds 
to the English word in its meaning of ‘acquire’ (27a). But it is increasingly being 
used in the inchoative sense of ‘get’ which is ousting the Irish verb éirigh ‘rise’ 
which is traditionally used in this sense: 


(27) a. Fuair sí bronntanas óna máthair.
[got she present from-her mother]
‘She got a present from her mother.’
b. Tá sé ag fáil níos fuaire anois. (modern)
Tá sé ag éirí níos fuaire anois. (traditional)
[is it at getting/rising more colder now]
‘It is getting colder now.’


9 Conclusion 


The data considered in this chapter shows how syntactic material can be trans- 
ferred from one language to another. For the language shift scenario it further- 
more shows how unguided second language acquisition means that a search for 
categories in the new language which speakers know from their first language
is a dominant feature of the shift process. Where the equivalents reached by 
individuals or small groups are accepted by the community they can establish 
themselves as focused features of a later shift variety. The situation of Irish and 
English today shows that, under pressure, a language can be infiltrated syntactic- 
ally by the dominant language. In aggregate, these external influences can lead 
fairly rapidly to typological change and thus illustrate a scenario in which
language convergence can take place. 


NOTES 


a 


See Vennemann (this volume) for relevant comments. 

This figure excludes the many foreigners who are now living in Ireland, immigrants 
to the country from the 1990s and early 2000s. In practical terms, one can say that the 
99 percent referred to consists of those people who are Irish and whose parents were 
Irish as well. 

The situation in the north of Ireland was quite different because there Scottish and north- 
ern English settlers had come in considerable numbers during the seventeenth century. 
The sample sentences provided in this chapter stem from various data collections of 
the author, both for Irish and for English. These are the following: CCE = A Collection 
of Contact English, DER = Dublin English Recordings, WER = Waterford English Recordings. 
In addition there are a few other abbreviations: M = male, F = female. Before a num- 
ber, W = West, S = South. TRS-D stands for Tape Recorded Survey of Hiberno-English Speech 
— Digital. This collection is based on recordings made by colleagues in the Department 
of English, Queen’s University, Belfast. 

Later, a distinct semanticization of this usage arose whereby the unbound reflexive came 
to refer to someone who is in charge, the head of a group or of the house, etc. 

These remarks refer to language shift. Of course, in a borrowing situation, a sound 
can enter a language with word(s) which show it, e.g. nasal vowels in German which 
are contained in French loans. Furthermore, new phonotactic combinations may enter 
with loanwords, e.g. [ʃ] before an obstruent or nasal in words of Yiddish origin in English, 
cf. schmooze, schmuck. 
8 Contact and Borrowing 


DONALD WINFORD 


1 Defining Borrowing 


Researchers who study language contact generally distinguish between two 
broad categories of contact-induced changes — those due to borrowing, and those 
due to what has variously been called “interference,” “transfer,” “substratum 
influence,” and so on. Borrowing is usually associated with situations of language 
maintenance, and has been defined as “the incorporation of foreign features into 
a group’s native language by speakers of that language” (Thomason & Kaufman 
1988: 37). On the other hand, “interference” is usually associated with situations 
of second language acquisition and language shift, and is described as the 
influence of an L1 or other primary language on an L2. 

But there is by no means any clear consensus on how borrowing should 
be defined, or how it can be distinguished from interference. For instance, 
Aikhenvald (2002), following Trask (2000: 44), defines borrowing as “the trans- 
fer of features of any kind from one language to another as the result of contact.” 
And both scholars define interference as “the non-deliberate carrying over of 
linguistic features from one’s first language into one’s second language,” noting 
that it is mainly applicable to second language acquisition. This implies that 
interference is a subtype of borrowing. By contrast, Heine and Kuteva (2005: 6) 
define borrowing more narrowly as “contact-induced transfer involving phonetic 
substance of some kind or another” (that is, forms or form-meaning units), and 
distinguish it from the transfer of meanings (including grammatical meanings 
or functions) and syntactic relations. This implies that the transfer of structural 
patterns cannot be considered a case of borrowing. Thomason and Kaufman 
present a diametrically opposed view, claiming not only that structure can be 
borrowed, but indeed that it can lead, in extreme cases, to typological change in 
the borrowing language. Clearly there is a need to reconcile these different views 
and achieve a more precise explanation of borrowing as a transfer type. 

A much clearer classificatory framework for contact-induced changes was 
introduced by van Coetsem (1988, 2000), which makes a distinction between 
two types of cross-linguistic influence or “transfer types,” namely, borrowing and 
imposition. The latter is more or less equivalent to terms like “interference via 
shift” and “substratum influence.” In this framework, “transfer” is used in a neutral 
sense to refer to any kind of cross-linguistic influence. In both types of transfer, 
there is a source language (SL) and a recipient language (RL) — a distinction which 
is more consistent than others that have been used in the literature, such as donor 
versus replica language, superstrate versus substrate language, and the like. The 
direction of transfer is always from an SL to an RL, and the agent of the transfer 
is either the RL speaker or the SL speaker. In the former case we have borrow- 
ing (RL agentivity), in the latter, imposition (SL agentivity). Van Coetsem defines 
borrowing as follows: 


If the recipient language speaker is the agent, as in the case of an English speaker 
using French words while speaking English, the transfer of material (and this 
naturally includes structure) from the source language to the recipient language is 
borrowing (recipient language agentivity). (van Coetsem 1988: 3, italics in original) 


In imposition, on the other hand, “the source language speaker is the agent, as 
in the case of a French speaker using his French articulatory habits while speak- 
ing English” (van Coetsem 1988: 3). 

The distinction between borrowing and imposition is based, crucially, on the 
psycholinguistic notion of linguistic dominance. As van Coetsem (1995: 70) explains, 
“A bilingual speaker ... is linguistically dominant in the language in which he 
is most proficient and most fluent (which is not necessarily his first or native 
language).” In borrowing, materials from a nondominant SL are imported into an 
RL via the agency of RL-dominant speakers. Borrowing in these cases typically 
involves vocabulary, though some degree of structural borrowing is also possible, 
as we will see. In imposition, the SL is the dominant (usually the first or primary) 
language of the speaker, who transfers features of the SL into an RL in which 
the speaker is less proficient, for instance an L2 he is attempting to learn (see Winford 
2005 and Smits 1998 for fuller discussion of the distinction between borrowing 
and imposition). 
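
The criterion just described can be stated very compactly. The following Python sketch is offered only as an illustration of the distinction, not as part of van Coetsem's framework: the names TransferEvent and transfer_type are hypothetical, and the sketch assumes that each agent has exactly one linguistically dominant language, which simplifies a situation that may be gradient and may change over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEvent:
    """One instance of cross-linguistic transfer (hypothetical model)."""
    source_language: str      # SL: where the material comes from
    recipient_language: str   # RL: where the material ends up
    dominant_language: str    # the agent's linguistically dominant language


def transfer_type(event: TransferEvent) -> str:
    """Classify a transfer by the agentivity of the speaker."""
    if event.dominant_language == event.recipient_language:
        return "borrowing"    # RL agentivity
    if event.dominant_language == event.source_language:
        return "imposition"   # SL agentivity
    # Assumption: a third dominant language is left unclassified here.
    return "undetermined"


# An English speaker using French words while speaking English.
english_speaker = TransferEvent("French", "English", dominant_language="English")
# A French speaker using French articulatory habits while speaking English.
french_speaker = TransferEvent("French", "English", dominant_language="French")

print(transfer_type(english_speaker))
print(transfer_type(french_speaker))
```

Note that the direction of transfer (French to English) is the same in both cases; only the dominant language of the agent differs, and that alone decides the transfer type.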

Linguistic dominance must be distinguished from social dominance, which refers 
to the social or political status of one of the languages. It is important to note that 
the socially dominant language may or may not be the linguistically dominant 
language of the speaker. This means that both borrowing and imposition can take 
place from a socially dominant to a socially subordinate language, and vice 
versa. Moreover, dominance relationships may change over time both in the 
individual speaker, and in the community. As we will see, the failure to distin- 
guish between social and linguistic dominance has led to misunderstanding of 
the nature of contact-induced change. Distinguishing carefully between social and 
linguistic dominance is preferable to traditional classifications because it avoids 
the indeterminacy that characterized them, and provides more insight on the kinds 
of process involved in contact-induced change. 

As far as borrowing is concerned, the RL speaker is the agent of the transfer, 
as in the case of a Japanese speaker who uses English-derived words while 
speaking Japanese. The RL in these cases is always the linguistically dominant 
language of the speaker, which is not necessarily his first or native language. 
Borrowing, then, can be defined as the transfer of linguistic materials from an 
SL into an RL via the agency of speakers for whom the latter is the linguistically 
dominant language, in other words, via RL agentivity. The nature and extent of 
borrowing also depend on another crucial factor, which van Coetsem (1988: 25) 
calls the “stability gradient of language.” This refers to the fact that certain 
domains or components of linguistic structure tend to be more stable and hence 
resistant to change than others. More stable domains include phonology, mor- 
phology (particularly inflectional paradigms), and aspects of syntax and semantics. 
Lexicon and certain areas of structure such as derivational morphology, free func- 
tion morphemes, and some aspects of syntax, are less stable, and hence more 
amenable to change. In general, borrowing tends to involve the lexicon and other 
less stable domains, and does not usually have a significant impact on the RL 
grammar. The extent to which structural borrowing can occur is a matter of some 
controversy. 

To summarize, borrowing can be defined as the transfer of linguistic materials 
from an SL into an RL via the agency of speakers for whom the latter is the 
linguistically dominant language, in other words, via RL agentivity. This conception 
of borrowing is fundamentally different from that used by previous researchers. 
Essentially, it refers to a psycholinguistic mechanism by which speakers intro- 
duce materials from an external language into a language in which they are (more) 
proficient. In doing so, they tend to preserve the more stable domains of the RL. 
This is not to say that borrowing never involves structural elements or abstract 
grammatical features, only that such structural borrowing occurs only under specific 
circumstances, which will be discussed further below. 


2 Lexical Borrowing 


A great deal of attention has been devoted to lexical borrowing and to classi- 
fication of its products. As early as the nineteenth century, scholars like Paul 
(1886), and later Seiler (1907-13) were attempting to classify lexical borrowings. 
Later, Betz (1949) formulated a distinction between Lehnwort ‘loanword’ and 
Lehnprägung ‘loan coinage’, which still forms a basis for current classifications. 
Haugen (1950, 1953) further refined this classification by distinguishing three 
broad categories of lexical borrowings — loanwords, loan meanings, and creations. 
Loanwords involve imitation of the phonological shape and meaning of some 
lexical item in the SL, and include pure loanwords, e.g., Spanish burrito borrowed 
into English, and loan blends, e.g., Pennsylvania German bassig ‘bossy’ (< English 
boss + German -ig). Loan meanings or loan shifts are of two types. The first involves 
changes in the semantics of an RL word under influence from an SL word, e.g., 
American Portuguese frio extended its meaning from ‘cold temperature’ to ‘cold 
infection’ under influence from English cold. The second type includes loan trans- 
lations or calques, in which a foreign word formation model is replicated by native 
words, e.g., German Wolkenkratzer for ‘skyscraper’. Creations, finally, involve the 
innovative use of native expressions to convey a foreign concept, as when the Pima 
created the phrase ‘wrinkled buttocks’ to refer to an elephant. Other innovations 
of this type include blends of native and foreign morphemes to express a newly 
acquired foreign concept, e.g. Yaqui lios-nóoka (< Spanish dios ‘God’ + Yaqui nóoka 
‘speak’) to convey the sense of ‘pray’. 

One of Haugen’s important contributions was to recognize that classifications 
such as these tell us little about the actual linguistic processes involved in lexical 
borrowing. Rather, as he noted, “most of the terms used in discussing [borrowing] 
are ordinarily descriptive of its results rather than of the process itself” (1950: 213). 
Haugen attempted to capture some insight into the process itself by identifying 
two aspects of borrowing, which he termed importation and substitution. 
Importation refers to the adoption of a foreign form and/or meaning, while 
substitution refers to the process by which RL sounds or morphemes are sub- 
stituted for those in the SL. As Haugen notes, every loan is part importation 
and part substitution. This insight led scholars to focus more on the linguistic 
processes underlying borrowing, and particularly on processes of adaptation and 
integration of borrowings into the RL. Haugen’s insights were further developed 
by van Coetsem, who couched the distinction between adoption and substitution 
in more psycholinguistic terms, replacing it with a distinction between imitation 
and adaptation as the two mechanisms involved not just in borrowing, but in 
contact-induced change in general (1988: 8-12). In the case of borrowing, adaptation 
usually involves only an adjustment to the native RL, which 
does not modify that language. This is a key diagnostic for borrowing, though it 
does not totally exclude the possibility of some modification of the RL due to this 
transfer type. 


3 Integration of Loanwords 


Lexical borrowings are usually adapted to the phonology and morphology of the 
RL, and eventually become indistinguishable from native items. Thus, English loan- 
words in Japanese are adapted to Japanese phonetics and phonotactics, particu- 
larly its preferred CV syllable structure, through various processes, including 
epenthesis (e.g. baseball > bēsubōru), syllabification of glides (e.g. quiz > kuizu), and 
cluster simplification (e.g. sweater > sētā). The process of adaptation can be even 
more complex, and foreign loans can be subjected to a variety of other processes 
of change. For instance, adaptation of English loans to Japanese has resulted 
in truncated compounds like pokemon (< pocket monster); semantic shifts like 
handoru ‘driving wheel of a car’ (< handle); and blends like dai-sutoraiku 
(< Japanese dai ‘big’ + strike). 
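
As a rough illustration of how epenthesis brings a loan into line with the preferred CV syllable structure, consider the following Python sketch. It is a toy under stated assumptions, not a description of Japanese phonology: the function name epenthesize is hypothetical, the input is an informal romanized transcription of the English word (e.g. "straik" for strike), and the rule of thumb used here (insert o after t and d, u after other consonants, and allow n to close a syllable) ignores vowel length, gemination, glides, and much else.

```python
VOWELS = set("aeiou")


def epenthesize(word: str) -> str:
    """Insert epenthetic vowels so that every consonant except n is
    followed by a vowel (a simplified CV template)."""
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        if ch in VOWELS:
            continue
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt in VOWELS:
            continue          # consonant already followed by a vowel
        if ch == "n":
            continue          # assumption: n may stand before a consonant or word-finally
        # Assumption: o after t/d, u elsewhere.
        out.append("o" if ch in "td" else "u")
    return "".join(out)


print(epenthesize("straik"))  # the cluster str- and the final k each receive a vowel
print(epenthesize("sain"))    # no change: final n is tolerated
```

On these assumptions the transcription "straik" comes out as sutoraiku, the form seen in dai-sutoraiku, while "sain" is left unchanged, as in sain suru.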

Loanwords also have to be adapted to the syntax and morphology of the RL, 
particularly if it has rules involving categories like case, number, gender, agree- 
ment, and the like. In general, loanwords pose few problems for adaptation to the 
syntax of the RL, assuming they share the syntactic behavior of items belonging 
to similar lexical categories in the RL. But sometimes even such assignment may 
be problematic. In (standard) Swahili, for example, nouns fall into 15 morpho- 
logically defined subclasses, each with its own pair of singular and plural 
prefixes, some of which are covert (Mkude 1986: 519). Differences in class membership 
are signaled by agreement markers appearing on demonstratives and other 
word classes that have to agree with the noun in question. Foreign loans there- 
fore have to be assigned to a noun class on the basis of some criterion. Sometimes 
this is done simply on the basis of a formal similarity to native stems. Thus, Arabic 
kitab ‘book’ has been reanalyzed as ki-tabu and assigned to class 7/8, with the 
plural vi-tabu. The more usual strategy is to place foreign loans into classes 5/6 
(with singular and plural prefixes Ø and ma- respectively), and especially classes 
9/10, which lack overt class prefixes (Mkude 1986). Another interesting example 
of morphological adaptation is the way borrowed nouns and adjectives are 
assigned grammatical gender in languages like Dutch, German, French, etc. Such 
assignment may depend on various factors, including formal criteria (similarity 
in phonological shape), meaning, and analogy. For instance, French nouns ending 
in -ment (gouvernement ‘government’, appartement ‘apartment’, etc.) generally 
receive neuter gender when borrowed into Brussels Dutch, while loans ending 
in -iteit (e.g. variabiliteit ‘variability’) are assigned feminine gender, on the basis 
of the suffix. On the other hand, “natural” gender may be used as the criterion 
for gender assignment. Thus French nouns which refer to males (agent ‘agent’, 
facteur ‘postman’, etc.) receive masculine gender when borrowed into Brussels Dutch, 
while nouns referring to females (danseuse ‘female dancer’, madame ‘madam’) are 
assigned feminine gender (Treffers-Daller 1994: 130). In other cases, analogy 
determines gender assignment, as when for example the English loanword Stress 
is assigned masculine gender in German by analogy with nouns like Kampf 
‘struggle’ which are semantically similar (though the status of a word as a mono- 
syllable ending in a consonant would further favor the assignment of masculine 
gender, Hickey 1999). 

Other processes of creative adaptation can also come into play as part of the 
integration of loan items into the morphological structure of the RL. Japanese for 
example treats English loans as stems that can be converted to other classes 
by the addition of suffixes or a helping verb (Loveday 1996: 118). For example, 
borrowed nouns may be converted into adjectives (or adjectival nouns) by 
attaching the suffix -na (e.g. romanchikku-na ‘romantic’) or into adverbs via 
affixation of -ni (e.g. romanchikku-ni ‘romantically’). Borrowed nouns may also be 
converted for use as verbs by adding the dummy verb suru ‘do, make’, e.g. sain 
suru ‘sign’, enjoi suru ‘enjoy’, etc. These strategies conform fully to Japanese 
patterns of derivation. Even the “clipping” of loan items common in Japanese (e.g. 
han-suto < hanga-sutoraiki < hunger strike) is a way of making such importations 
conform more closely to native Japanese morphophonology (Loveday 1996: 118). 

We have seen that the results of the process of borrowing are quite varied. Some 
are close imitations of foreign items (e.g. rendezvous borrowed from French into 
English). Others change drastically in shape (e.g. Costa Rican Spanish chinchibi 
from English gingerbeer), while still others are inventions that employ only RL 
materials in imitation of some foreign pattern (e.g. Spanish rascacielos modeled 
on English skyscraper). In fact, many borrowings do not represent complete 
adoption of a foreign item with both its form and meaning intact. Some borrow- 
ings consist of foreign forms that have been assigned new meanings (e.g. 
Japanese sumato ‘slim, slender’ < English smart). Others involve the adoption of 
a foreign meaning or concept to which a native lexeme is attached (e.g. Japanese 
sara, extended to include Western-style ‘plate’). Also, many of the outcomes of 
lexical borrowing involve innovations that have no counterpart in the SL, some 
of which may be created out of foreign materials (e.g. Japanese wan-man-ka ‘bus 
without a conductor’ < English one + man + car), while others may be created out 
of native materials, for example Zapotec éxxuwi ‘fig’ < exxu ‘avocado’ + wi 
‘guava’. Still other results of borrowing are blends of native and foreign items 
(e.g. Yaqui liosnóoka ‘pray’ < Spanish dios ‘God’ + Yaqui nóoka ‘speak’, see above). 
It would appear that the composition of lexical entries can be manipulated and 
rearranged in a variety of ways to produce different types of lexical borrowings. 
The various types of integration we have examined here demonstrate that bor- 
rowing involves complex patterns of lexical change that create new lexical entries 
or modify existing ones. In all cases, borrowed items are manipulated so that they 
conform to the structural and semantic rules of the RL. This is the hallmark of 
borrowing under RL agentivity. 


4 Borrowing of Structural Elements 


It seems uncontroversial that overt structural elements, both phonological and mor- 
phological, can be transferred from one language into another. But there appear 
to be strict limits on what can be transferred, and under what conditions. In most 
cases, such transfer is mediated by lexical borrowing, in other words, structural 
elements come along with lexical borrowings, and may end up becoming part of 
the RL system. With regard to phonology, for instance, Twana acquired voiced 
stop and affricate phonemes via lexical borrowing from Lushootseed, another 
Salishan language (Thomason & Kaufman 1988: 81). Additionally, phonological 
changes that are promoted by lexical borrowing may involve the phonemicization 
of phonetic distinctions that already existed in the RL. For instance, distinctions 
between /s/ and /z/, /f/ and /v/, and /θ/ and /ð/, emerged in Middle 
English as a result of the heavy borrowing of French words containing the voiced 
members of the respective pairs. No new sounds were really introduced. Even in 
situations where significant phonological change has occurred in an RL under 
contact, we find little evidence of direct transfer of phonemes. For instance, 
Aikhenvald documents a wide variety of phonological innovations in Tariana under 
East Tucanoan influence, but only one or two of these involve change in the phon- 
emic inventory, such as the creation of a contrast between /p/ and /b/. But even 
this and other changes such as the increased use of /o/ and the rise of a Vh syl- 
lable pattern are due to lexical borrowing. Aikhenvald notes that these changes 
result in “a simple increase of frequency of otherwise rare sounds, and [do] not 
result in the introduction of new phonemes, or new syllable patterns” (2002: 53). 
All of this suggests that borrowing of phonological elements is extremely rare, 
and, when it occurs, it tends to be mediated by lexical borrowing. 

At the morphological level too, there is ample evidence that functional or gram- 
matical elements can be borrowed. But some types lend themselves more readily 
to transfer than others. For instance, it is well known that free functional elements 
such as conjunctions, prepositions, pronouns, and even complementizers can be 
borrowed directly. For example, Native American languages have borrowed con- 
junctions like pero ‘but’, and subordinators like porque ‘because’ from Spanish. Such 
borrowings are akin in some respects to lexical borrowings, though their status as 
closed-class items makes them less amenable to transfer. It has been proposed 
that there is a hierarchy of borrowing according to which open-class items like nouns 
and adjectives lend themselves most readily to borrowing, while closed-class items 
are more resistant to it. Muysken (1981) proposed the following hierarchy. 


(1) nouns > adjectives > verbs > prepositions > coordinating conjunctions > 
quantifiers > determiners > free pronouns > clitic pronouns > subordinating 
conjunctions 


Though the pattern is not exactly the same in all situations, the general outline 
of this hierarchy has been confirmed in various studies. 
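
For readers who wish to compare categories systematically, the hierarchy in (1) can be treated as an ordered list. The Python sketch below is only a convenience for illustration: the names HIERARCHY, RANK, more_borrowable, and sort_by_borrowability are hypothetical, and the strict linear ordering is an idealization, since the pattern is not exactly the same in all contact situations.

```python
# Categories in the order given in (1), most readily borrowed first.
HIERARCHY = [
    "nouns", "adjectives", "verbs", "prepositions",
    "coordinating conjunctions", "quantifiers", "determiners",
    "free pronouns", "clitic pronouns", "subordinating conjunctions",
]

# Lower rank means more readily borrowed.
RANK = {category: position for position, category in enumerate(HIERARCHY)}


def more_borrowable(a: str, b: str) -> bool:
    """True if category a stands higher on the hierarchy than category b."""
    return RANK[a] < RANK[b]


def sort_by_borrowability(categories):
    """Order categories from most to least readily borrowed."""
    return sorted(categories, key=RANK.__getitem__)


print(more_borrowable("nouns", "verbs"))
print(sort_by_borrowability(["determiners", "verbs", "nouns"]))
```

Read in this way, the hierarchy makes a relative prediction about any two categories in a given situation; it says nothing about absolute rates of borrowing.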

There is also evidence that certain kinds of bound morphology can be borrowed, 
but again there appear to be strict limits on the extent to which this can occur. 
Bound derivational morphemes are quite often introduced to an RL along with 
lexical borrowings, and can become productive if the borrowed words are 
numerous. For example, the massive importation of French words into Middle 
English introduced various derivational affixes that were eventually extended to 
use with native stems. Borrowings like conspir-acie, charit-able and others yielded 
suffixes like -acy and -able, while en-rich, dis-connect, etc. yielded new prefixes. Yet, 
despite the rich contribution of French, its overall impact on English morphology 
was not great. French loans were for the most part adapted to native word-formation 
processes, in keeping with what we would expect in a situation of RL 
agentivity. 

Borrowing of overt inflectional morphology appears to be very limited, though 
it has been documented. Once more, lexical borrowing can be the channel for such 
transfer, as exemplified in the introduction of the plural inflection -im into 
Yiddish via borrowed Hebrew pairs such as min/minim ‘sort’. This is similar to 
the introduction of Latin and Greek plural inflections into English via pairs like 
focus/foci, phenomenon/phenomena, and the like. But, as Hickey (p.c.) points out, these 
were originally scholarly borrowings into the written language, which spread to 
the spoken language through education. Since these inflections were confined to 
a few learned words and used primarily by the educated, this might explain why 
they never became productive in English, unlike the new Yiddish plural -im, which 
was later generalized to other words. Borrowing of inflectional morphemes can 
also occur in dialect contact or other situations of close typological fit between 
languages, where substitution of one inflection for another is facilitated. For 
instance, Meglenite Rumanian speakers replaced their native verb inflections -u 
and -i with Bulgarian inflections /-um/ and /-iʃ/ respectively (Weinreich 1953: 51). 
In some cases, earlier borrowing of lexical patterns may create conditions favor- 
able to later borrowing of bound morphemes. For instance, the calquing of verb 
compounding patterns in Tariana on the model of East Tucanoan languages 
created a new slot in Tariana verbal structure into which East Tucanoan gram- 
matical elements could be incorporated (Aikhenvald 2002: 141). Again, such 
direct transfer was the exception rather than the rule in this situation, despite the 
extensive influence of East Tucanoan on many aspects of Tariana grammar. We 
have to conclude, therefore, that the direct borrowing of structural elements or 
what Croft (2003: 51) calls “substance linguemes” is indeed a rare occurrence in 
most situations of language contact. 


5 Constraints on Borrowing of Overt Elements 


The borrowing of overt forms, whether lexical or morphological, is subject to both 
social and linguistic constraints. Socially based motivations for lexical borrowing 
are usually associated with need and prestige. As Weinreich (1953: 56) noted, 
the “need to designate new things, persons, places, and concepts” is a compelling 
reason to borrow lexical items. Both socially dominant and subordinate languages 
borrow from one another for this reason. For example, the contact between English 
and various Native American languages during the colonial period led to import- 
ation of words like skunk, moccasin, etc. from Algonquian languages into American 
English. Similarly, Australian English adopted words like kangaroo, billabong, etc. 
from Aboriginal languages. Borrowing is especially common where there is need 
to keep abreast of developments in science, technology, and higher learning 
generally. This is what prompted much of the borrowing from French, Latin, and 
Greek into English in the Early Modern English period, and from Chinese into 
Japanese in the Middle Ages. The instrumentalization of vernaculars as official 
and national languages has also led to heavy lexical borrowing in order to fill 
gaps in the lexicon. 

Borrowing also tends to be motivated by considerations of prestige, which explains 
why socially subordinate languages tend to borrow more from dominant languages 
than vice versa. A frequently cited example of this is the borrowing of items like 
pork, beef, veal, etc., from French into Middle English, which supplemented native 
words like pig, cow, etc. (as labels for the processed forms of the meat in question). 
In general, however, concepts such as “need” and “prestige” are only one part 
of the explanation for lexical borrowing. In addition, the motivations for borrowing 
have to be understood in relation to the sociolinguistic and sociopolitical aspects 
of the contact between the speakers of the languages. Such factors include the 
patterns of social interaction between the groups, the degree of bilingualism, the 
demographics and power relationships, and attitudes toward the languages. 
Very often, bilingual situations that seem quite similar at first sight turn out to be 
quite different in the degree to which they promote lexical borrowing. Moreover, 
rates of borrowing correlate with differences in social class and neighborhood 
(Poplack, Sankoff, & Miller 1988), as well as educational history, types of social 
network, etc. (Treffers-Daller 1994). 

Finally, language loyalty and language ideology are important factors that can 
constrain borrowing. Loyalty to one’s language and pride in its autonomy promote 
resistance to foreign incursions. Indeed, some nations have enacted language poli- 
cies and even legislation to preserve the “purity” of their languages. A well-known 
example is the effort by the French to limit or eradicate foreign, especially English, 
loanwords from their language. Language loyalty, of course, goes hand in hand 
with perceptions of group identity. Alsatian-French bilinguals in Strasbourg perceive 
language mixture as a reflection of their ethnic identity and are therefore 
more tolerant of borrowing and code-switching. By contrast, speakers of French 
and Dutch in Brussels mix their languages much less, since they perceive them- 
selves as distinct in both ethnic and linguistic terms. A similar desire to preserve 
autonomy explains why speakers of Tariana actively resist overt borrowings from 
East Tucanoan languages, though they seem unaware of the fact that the latter 
have had significant but less visible impact on the abstract structure of their 
language. Finally, considerations of social mores or taboo can lead to restrictions 
on borrowing. For instance, it is taboo for Nguni women to pronounce the names 
of senior male relatives such as their fathers-in-law. In some cases, even the 
syllables contained in such names must be avoided. This practice is referred to 
as hlonipha, which conveys the sense of “respect through avoidance.” According 
to Herbert (1995: 59), Nguni speakers accomplished this by substituting either other 
native sounds or foreign sounds such as Khoesan clicks for the sounds that had 
to be avoided. In some cases, inherited words were replaced with a foreign hlonipha 
alternative. 


6 Linguistic Constraints on Borrowing 


We have already seen that the borrowing of open-class lexical items such as nouns, 
adjectives, and verbs is fairly common. Part of the reason for this is that the 
lexicon lies on the less “stable” end of the stability gradient. By contrast, the greater 
structural cohesiveness of closed-class items makes them less amenable to 
borrowing. But other constraints operate as well. For instance, the degree of typo- 
logical distance between the languages in contact may act to promote or inhibit 
the borrowing of even open-class items. It appears that lexical categories that exhibit 
a higher degree of morphological complexity tend to resist borrowing more. This 
may explain why, for instance, verbs tend to be borrowed less often than nouns 
or adjectives. On the other hand, if there is congruence in verbal structure across 
the languages, borrowing is facilitated. For instance most French verbs borrowed 
into Brussels Dutch tend to be from the -er class, since these lend themselves more 
readily to incorporation into the class of regular Dutch verbs whose infinitival 
suffix is -en (e.g. blesseren ‘hurt’ < French blesser). In cases where the morphological 
complexity of verbal structure in the RL acts as a barrier to borrowing of verbs, 
many languages get around the problem by creating a compound verb consisting 
of a borrowed stem and a native verb meaning ‘do’ or ‘make’, which carries the 
necessary inflections (e.g. English help + Hindi karne ‘do’ = ‘help’). 


7 Constraints on Borrowing of Structural 
Elements 


As we saw earlier, the borrowing of bound structural elements is comparatively 
rare. Again, the reasons for this have to do with the fact that such elements are 
more tightly integrated into the various subsystems of the grammar, and may not 
be easily isolated. In general the constraints on borrowing of such items fall into 
three categories. First, such borrowing is more likely if there is sufficient congruence 
between morphological structures across the languages involved. For instance, the 
borrowing of several verbal suffixes from Ritharngu into Ngandi in Arnhem Land, 
Australia, was facilitated by the fact that the two languages shared a similar verb- 
inflectional pattern (Heath 1978). Second, morphemes that are “transparent” in 
form and function tend to be borrowed more readily than those that are more 
opaque. Transparency is a function of whether the morpheme in question is eas- 
ily isolatable, and has a clear and consistent meaning wherever it appears. Third, 
the existence of gaps in the morpheme inventory of an RL can promote import- 
ation of new morphemes to fill such gaps. The fact that all of these conditions 
are often not satisfied in situations of language contact may explain why direct 
borrowing of overt structural elements is rare. Even when it occurs, it has little 
impact on the overall grammar of the RL. For instance, while borrowing a few 
verbal suffixes from Ritharngu, Ngandi still preserved all of its own verbal 
suffixes marking tense, aspect, mood, and negation (Heath 1981). 


8 Is There Borrowing of Structural Patterns? 


The picture is quite different when we look at the transfer of structural patterns, 
grammatical categories and functions, which occurs with surprising regularity in 
situations of contact. No one disputes this fact, but there is sharp disagreement 
over the role played by borrowing in such kinds of transfer. On the one hand, 
there is a widely held view that only the transfer of overt grammatical morphemes 
constitutes borrowing in the strict sense. Other kinds of structural transfer are 
referred to by other names, such as “pattern transfer,” “grammatical replication,” 
“indirect diffusion,” etc. The hesitancy to refer to these kinds of change as bor- 
rowings perhaps reflects a traditional view among historical linguists that most 
aspects of grammar are highly resistant to change — a view that is in keeping 
with the idea of the stability gradient in language. On the other hand, Thomason 
and Kaufman (1988: 74) claim that there is a scale of borrowing ranging from 
relatively slight lexical borrowing to “heavy” structural borrowing in situations 
of very intense contact. We therefore need to reconcile the two positions as far 
as we can. 

There are two types of contact situation in which Thomason and Kaufman claim 
that significant structural borrowing can occur. The first includes those in which 
one language “borrows” the entire grammar of another language while preserv- 
ing most of its native lexicon, thus resulting in a new mixed language. The second 
type of situation involves long-term contact in the course of which one language 
converges gradually toward another in all aspects of structure. 

As an example of the former type of situation, Thomason and Kaufman cite cases 
like Ma’a, a language spoken by the Mbugu people of Tanzania, and Angloromani, 
spoken by the Roma or Gypsy groups in the British Isles. The grammatical frame 
of Ma’a is more or less identical to that of Mbugu, the Bantu language which the 
group also speaks, while its lexicon comes from other sources, chiefly Eastern 
Cushitic, the language from which the group apparently shifted. Thomason and 
Kaufman argue unequivocally that “almost all of the original Cushitic grammar 
and at least half ... of the Cushitic vocabulary have been replaced by Bantu grammar 
and lexicon” (1988: 49), and claim that this came about through massive structural 
borrowing. But the idea that a language can, over the years, gradually borrow 
the entire structural apparatus of another language, overt grammatical elements 
and all, is extreme, to say the least. It flies in the face of all we know about the 
strong constraints on the borrowing of overt structural elements, far less an 
entire inventory of such elements. Bakker (2003: 137) offers compelling evidence 
against this point of view, noting that “it is impossible to follow the pathways of 
borrowing outlined in Thomason and Kaufman and end up with a language like 
Angloromani or Ma’a.” A far more feasible explanation is that languages like these 
arose through massive lexical borrowing into the new language that speakers 
had already adopted, Mbugu in the case of Ma’a, and English in the case of 
Angloromani. Further arguments for this scenario will be presented below. 

The second type of situation that Thomason and Kaufman cite as a case of heavy 
structural borrowing is that in which a sociopolitically dominant language has 
exerted intense “cultural pressure” on the native language of a group over a long 
period of time. They claim that, under such conditions, a wide range of struc- 
tural innovations can be borrowed at all levels of structure. The upper end of their 
borrowing scale lists changes like the following: in phonology, new syllable 
structures, new allophonic and morphophonemic rules, loss or addition of 
phonemic contrasts, etc.; in morphology, the adoption of foreign inflectional 
affixes and categories, new word structure rules, agglutinative morphology to 
replace flexional morphology, etc.; in syntax, new word order, agreement rules, 
embedding strategies, and so on. But when we look closely at the situations in 
which such transfer occurs, we find that they do not conform to the criteria for 
borrowing as defined here, particularly with regard to the linguistic dominance 
relationship between the languages in contact, and the kind of agentivity that is 
involved. Such situations usually involve high degrees of bilingualism among the 
RL speakers. Thomason and Kaufman (1988: 66) in fact claim that “the traditional 
prerequisite for structural borrowing ... is the existence of a bilingual group within 
the borrowing language speaker population.” They further claim that structural 
borrowing is initiated by native speakers of the RL. But this begs the question 
as to what transfer type and agentivity is involved. The point is that the native 
speakers of the RL are also proficient in the SL, thus opening the door for trans- 
fer under both RL and SL agentivity, that is both borrowing and imposition. There 
is an unfortunate tendency among some scholars to equate situations of language 
maintenance with borrowing, and to assume therefore that even extreme struc- 
tural changes in a maintained language must be due to borrowing. The fact is 
that many such situations involve ongoing shift in which speakers of an ances- 
tral language are becoming increasingly proficient, and eventually dominant, in 
the other language, resulting in growing transfer of linguistic habits from the 
latter into the ancestral language. 

A well-known situation of this type is the contact between Greek and Turkish 
in Asia Minor, which lasted for hundreds of years until the Turks expelled most 
of the Greeks from the region in 1922. Contact with Turkish led to significant changes 
in Asia Minor Greek at every level of structure. Many of these changes, particu- 
larly in the lexicon, were no doubt due to borrowing in earlier stages of the contact. 
But as Greeks became increasingly proficient and even linguistically dominant in 
Turkish, structural transfer became even stronger, leading to pervasive changes 
in the phonology, morphology, and syntax of Asia Minor Greek (Janse, to appear). 
Structural changes in Greek phonology included the adoption of new Turkish 
vowels, the loss of dental fricatives (not found in Turkish), the introduction of 
rules of vowel harmony and allophonic variation, etc. Morphological changes 
included a new type of noun declension, loss of gender distinctions in nouns and 
adjectives, introduction of a new tense and a new mood category, etc. In addi- 
tion, bound morphology was replaced by periphrastic constructions to convey 
certain tense and mood categories, on the model of Turkish. There were also 
various innovations in Greek syntax, involving word order, copula constructions, 
and interrogatives, etc. In short, the strong and pervasive influence from Turkish 
led to significant structural change in Asia Minor Greek which went far beyond 
what we would expect of borrowing under RL agentivity. Yet, Thomason 
and Kaufman (1988: 218) argue that “if Turks did not shift to Greek, all of the 
interference must be due to borrowing.” This highlights once more the need to 
distinguish the agents of change from the kinds of agentivity that are employed 
in situations of contact. It is just as feasible that the agents of change were 
Turkish-dominant Greeks, who were imposing Turkish strategies on the declining 
ancestral language in which they were losing proficiency. 


9 Borrowing and Other Contact Phenomena 


The view of borrowing as a process that involves RL agentivity allows us to 
link phenomena that have been interpreted in very different ways in the litera- 
ture. Among them are classic code-switching, relexification and the creation of 
bilingual mixed languages. The position taken here is that borrowing is the trans- 
fer type that has played the key role in these kinds of language mixture. 


10 Borrowing and Classic Code-Switching 


In classic code-switching, a speaker retains the morphosyntactic frame of his 
dominant language (as the RL), and imports single content morphemes or 
phrases from an external SL, as in the following example of French-Arabic 
switching (Bentahila & Davies 1983: 319): 


(2) C’est une pauvre bint. [Arabic item in italics] 
‘She is a poor girl.’ 


Myers-Scotton (1993, 2002) has shown that, in these cases, one language, the RL 
in our terms, acts as the matrix language, that is, provides the morphosyntactic 
frame for the utterance. Moreover, the items incorporated from the embedded lan- 
guage or SL consist mainly of open-class content morphemes, which are inflected 
according to the rules of the RL. In the following example of Swahili-English code- 
switching, the English stem decide is treated just like a Swahili verb, and inflected 
accordingly (Myers-Scotton 1993: 4): 


(3) Hata siku hizi ni-me-decide kwanza kutumia sabuni ya miti 
even days these 1sg-PERF-decide first to use soap of stick 
‘[But] even these days I’ve decided first to use bar soap.’ 


Some researchers have attempted to distinguish such single morpheme insertions 
from borrowings by appealing to criteria such as frequency of use by monolinguals 
and degree of morphological integration of the inserted item. But frequency 
counts are inconclusive, and the distinction between a switch and a borrowing 
is not transparent to bilinguals. The criterion of integration is also shaky, since 
both word switches and borrowings may or may not be adapted to the phono- 
logy and morphology of the RL. Moreover, there is no difference in the types of 
lexical categories that can be switched or borrowed, and the same hierarchy of 
borrowability applies to both. It seems best, therefore, to treat lexical switches and 
lexical borrowings as manifestations of the more general phenomenon of borrowing 
under RL agentivity. While the results may differ in some ways, the same under- 
lying process is involved. 


11 Borrowing, Relexification, and 
Mixed Languages 


Understanding borrowing in terms of processes rather than results also helps 
us to explain how certain kinds of contact language were created in the past. In 
some contact situations, the process of incorporating content morphemes into 
an RL can be taken to an extreme, resulting in new languages that derive their 
morphosyntactic frame from one language, and their lexicon from another. These 
creations are referred to as intertwined languages, or “Lexicon-Grammar mixed 
languages” (Bakker 2003). Bakker claims there are about 25 such documented 
cases. Perhaps the best-known example is Media Lengua, a language spoken in 
central Ecuador, in which almost the entire grammatical frame is Quechua, but 
practically all content morphemes or lexemes derive from Spanish. The following 
example illustrates (Spanish items in italics): 


(4) ML: Unu fabur-ta pidi-nga-bu bini-xu-ni (Muysken 1997: 365) 
one favor-ACC ask-NOM-BEN come-PROG-1 
‘I come to ask a favor.’ 
Q: Shuk fabur-da maña-nga-bu shamu-xu-ni 
Sp: Vengo para pedir un favor. 


As can be seen, the process of incorporation and integration of Spanish stems is 
identical to what we saw in the cases of classic code-switching above. Muysken 
referred to this process as “relexification,” describing it as “the process of vocabu- 
lary substitution in which the only information adopted from the target language 
[source language — DW] in the lexical entry is the phonological representation” 
(1981: 61). It seems clear that the kind of lexical incorporation Muysken describes 
is in principle the same as that which occurs in lexical borrowing and classic code- 
switching. In all cases, SL lexical forms are imported and integrated into the 
unchanged structural frame of an RL, via RL agentivity. 

Failure to understand the similarity in the underlying process may explain why 
some researchers see little connection between code-switching and the emergence 
of intertwined languages. Thus Golovko (2003: 196) states that “the idea of code 
switching as an initial start for emergence of mixed languages should be rejected,” 
since, structurally, “there is not very much in common between the two.” Bakker 
(2003: 129) agrees with this, on the grounds that the quantity of embedded lexicon 
in such languages is far greater than in ordinary code-switching, and there seem 
to be no outcomes that occupy the middle ground between the two. But such objec- 
tions are based on comparisons of the results, rather than of the processes 
involved. The differences among the outcomes have to do with the degree to which 
the processes apply, and the extent to which the switches become conventional- 
ized as fixed lexical selections. These are matters of social convention, not linguistic 
process. Moreover, contra Bakker, we do find mixed languages that represent a 
middle stage in the continuum between classic code-switching and intertwined 
languages. They include Chamorro and Maltese, which have adopted a great deal 
of vocabulary from Spanish and Italian respectively. Yet in neither case has the 
massive lexical borrowing affected the inherited grammar of the language — 
Semitic (Afro-Asiatic) in the case of Maltese, and Austronesian in the case of 
Chamorro. Moreover, similar mixed languages such as Tsotsitaal, Isicamtho, and 
Sheng are still in the process of emerging even now. These facts confirm that there 
are no clear grounds on which to distinguish the various types of bilingual mixture 
in terms of the processes that created them. 

Intertwined languages also conform in other respects to the constraints on struc- 
tural borrowing that regulate classic code-switching. In both cases, only some kinds 
of overt function items such as articles and some prepositions can be incorpor- 
ated into the RL. But there are strong constraints on the incorporation of most 
system morphemes, particularly bound ones. Prototypical mixed languages like 
Media Lengua and Ma’a follow these constraints. However, there are other mixed 
languages whose grammatical frame is a composite blend of structural morphemes 
from the two input languages, and which therefore fail to conform to the 
grammar-lexicon split of the prototypical cases. Two well-known examples are Michif, which 
combines French NP structure with Cree VP structure, and Mednyj Aleut, a blend 
of Russian and Aleut. The composite grammars of languages like these pose much 
difficulty for the notion of borrowing as the transfer type responsible for their 
creation. 

Michif in many ways recalls another kind of code-switching in which phrasal 
constituents or “islands” from an SL are embedded into the grammatical frame 
of an RL. But the creators of Michif took this process to an extreme, and con- 
ventionalized the combination of French NP islands with Cree VPs. We could 
perhaps conceive of this as a product of borrowing in the broad sense, if we allow 
it to include the incorporation of phrasal constituents. Mednyj Aleut poses more 
of a problem, since it has incorporated both bound and free structural elements 
from Russian and Aleut in all components of the grammar, though most of its 
lexicon comes from Aleut. Russian supplies the tense/mood/aspect morphology, 
coordinating and subordinating clause combining devices, word order, etc. Aleut 
provides nominal inflections, adpositions, a copula verb, purpose markers, verb 
derivations, etc. This kind of structural mixture appears unique among con- 
ventionalized mixed languages, and does not fit any pattern of borrowing as 
defined here. 

The continuum of mixture across bilingual mixed languages reflects the hier- 
archy of constraints on structural borrowing which have been observed for both 
lexical borrowing and code-switching. Myers-Scotton has observed that what she 
terms “early system morphemes” such as determiners are often transferred from 
the SL into the matrix language in classic code-switching. By contrast, “late system 
morphemes” such as case markers, agreement markers, and others that convey 
grammatical relationships across constituents are practically never transferred in 
the classic cases. Transfer of this type is more typical of processes of structural 
convergence, which is distinct from classic code-switching, and more similar to 
imposition than to borrowing. Matras proposes a very similar hierarchy of struc- 
tural transfer for bilingual mixed languages (Matras 2003), suggesting that dif- 
ferent “layers” of structure characterize different types. In the prototypical cases, 
such as Media Lengua, we may find a first layer consisting of structural elements 
such as derivational affixes that have been brought along with lexical insertions. 
In addition a second layer is often found in these languages, consisting of items 
such as lower numerals, personal pronouns, deictics, negators, and the like, which 
have been also imported from the lexifier language. These, of course, correspond 
closely to the free structural elements that lend themselves more readily to 
borrowing, as we saw earlier. Finally, in some extreme cases, such as Mednyj Aleut, 
a third layer of structural elements can be imported, consisting of agreement 
markers, tense and aspect morphology, case markers, and the like, in other 
words, late system morphemes. These features are very rarely transferred across 
languages, and it is difficult to conceive of them as borrowings. The circumstances 
in which such transfer occurs usually involve highly skilled bilinguals who have 
the ability to deliberately mix structural elements from their two languages. The 
languages that result represent new creations in which the usual constraints on 
both lexical and structural transfer are relaxed to an extreme. They appear to 
be the result of a conscious act of “folk linguistic engineering” (Golovko 2003) 
or “change by deliberate decision, [which] is a quintessentially social factor” 
(Thomason 2003: 35). The motivation for such manipulation of linguistic resources 
appears to have been the desire of a bilingual group to signal its own unique 
identity through use of a mixed language. 


12 Conclusion 


In summary, the processes of borrowing can lead to a wide range of linguistic 
outcomes, from limited adoption of lexical items, to classic code-switching, to heavy 
adoption of lexical materials along with some overt structural elements, and finally 
to the creation of mixed languages with lexicon drawn from one language 
and grammar from another or indeed mixed grammars in the cases of Mednyj 
Aleut and Michif. All of these outcomes share the mechanism of RL agentivity, 
whereby speakers import lexical and some structural material from an SL into 
an RL whose grammatical frame they preserve more or less intact. Focusing on 
borrowing as a process rather than a result allows us to make connections among 
all of these contact phenomena, and helps us to avoid problems of indeterminacy 
in our definition and classification of types of contact-induced change. 


NOTES 


1 Some scholars avoid the term “borrowing” altogether. For instance, Johanson (2002, this 
volume) uses the term “code copying” to refer to all kinds of cross-linguistic influence, 
claiming that terms like “borrowing,” “transfer,” “interference,” and the like, are 
vague and misleading (2002: 288). 

2 This is a crucial difference between the approach used here and that of Johanson (2002), 
who distinguishes two types of “code-copying,” which he calls “adoption” and “imposition.” 
The former is defined as copying from a sociolinguistically dominant code into 
a sociolinguistically dominated code, while the latter involves copying in the opposite 
direction (2002: 290). Since this distinction rests entirely on social dominance relationships, 
it is fundamentally different from the distinction made here between borrowing 
and imposition, which rests on linguistic dominance relationships. 

3 French influence was confined to derivational morphology, and only those borrowed 
French suffixes that were semantically and morphotactically transparent became productive 
in English (Dalton-Puffer 1996: 224). 


9 Contact and 
Code-Switching 


PENELOPE GARDNER-CHLOROS 


1 Introduction 


This chapter explores various aspects of the relationship between code-switching 
(henceforth CS) and language contact. CS occurs in contact situations of many 
types and relates in complex ways to both language shift — the displacement of 
one variety by another — and language change — meaning changes to the fabric 
of the varieties themselves. The relationship between these phenomena, described 
in sections 2 and 3, is an indirect one (as is the relationship between shift and 
change itself). In section 3.3, an example is given of the way in which this 
relationship is articulated in practice via intervening factors (in this case gender 
and politeness) in the context of young second-/third-generation Greek 
Cypriots in London. 

It is a fairly safe assumption that some CS will occur in most, if not all, contact 
situations. It is found among immigrant communities, regional minorities, and 
native multilingual groups alike. Gumperz and Hernandez wrote that it could 
be found “each time minority language groups come into contact with majority 
language groups under conditions of rapid social change” (1969: 2). Others (e.g. 
Giacalone Ramat 1995) have on the contrary described it as a feature of stable 
bilingualism in communities where most speakers can speak both languages. There 
are descriptions of contact situations in which it receives little or no prominence 
(e.g. Jones, 1998, with respect to Wales; Spolsky & Cooper, 1991, on Jerusalem), 
but this does not necessarily mean it does not occur in those settings. Other 
consequences of contact or restructuring may be the primary focus, or the data 
collection techniques may not center on the informal conversational modes 
where CS occurs. 

CS can be associated with different configurations within a contact situation, 
from accommodation to divergence and from language maintenance to language 
shift. It reflects social differences and tendencies within the same society and 
language combination (Bentahila & Davies 1991; Li Wei 1998; Treffers-Daller 
1998), just as it reflects those between different societies and different language 
combinations (Poplack 1984; McClure 1998). It should be seen as part of a “bigger 
picture” which includes other forms of register variation (Halmari & Smith 1994). 

Contact in the form of CS has mainly been investigated in spoken contexts, 
and it has doubtless occurred in this form throughout history (for example Stolt 
1964 on German-Latin CS in Luther’s “table-talk”). But it is also found in written 
texts (Montes-Alcala 1998; Sebba 2005). Examples described in the literature 
include Latin-Greek CS in Cicero’s letters to his friend Atticus; French-Italian 
in a thirteenth-century Coptic phrasebook and a fourteenth-century Venetian 
manuscript of the Song of Roland (Aslanov 2000); German-Latin in the work of 
a seventeenth-century linguist, Schottelius (McLelland 2004); English-French CS 
in a variety of Medieval English texts (Trotter 2002); literary works such as 
Chicano poetry (Valdes-Fallis 1976) and novels where CS speech is represented, 
such as Tolstoy’s War and Peace (Timm 1978), Eco’s The Name of the Rose (where 
Salvatore speaks a multilingual jargon), and in contemporary novels such as 
those by Zadie Smith (English-Creole in White Teeth) and Nicholas Shakespeare 
(Spanish-Quechua in The Dancer Upstairs). Script-switching is also found in 
many contexts. In the fifteenth century, when van Eyck wrote “Als ich kann” in 
German (‘as (well as) I can’) above his self-portrait, he did so in Greek letters: 
αλς ιχ χαν, probably to indicate his erudition (also punning on his name, 
Eyck/ιχ). Angermeyer (2005) has described script alternation in contemporary 
Russian-American advertising, which is a way of signaling a bilingual identity. 
A Hassidic newsletter distributed in North London, Hakohol, includes code- 
switches from Yiddish and Hebrew, the latter in Hebrew script. The lifestyle 
magazine Latina, aimed at American women of Hispanic descent, is full of 
Spanish-English CS. Mahootian (2005) describes this as a reflection of commu- 
nity norms, but also as challenging relations of power and dominance between 
the older Hispanics, the monolingual English-speaking community, and the 
younger Hispanic generation. CS in writing is symptomatic of certain types of 
written discourse taking on the informality of conversation, as in the email and 
texting practices of many young people of mixed heritage (Hinrichs 2006). There 
is even a “mini-genre” of “bivalent” texts, dating from the sixteenth to seventeenth 
centuries, composed so that they could be read simultaneously in Latin and vari- 
ous Romance languages, notably Spanish — a way of punning with linguistic affili- 
ation rather than with meaning (Woolard & Genovese, 2007). So while the written 
language is generally considered more resistant to nonstandardness than the 
spoken language, and CS may be considered a nonstandard mode of expression, 
its functions are so diverse that it in fact enters the written language in many forms. 


2 CS in Relation to Language Shift/Vitality 


2.1 Convergence versus preserving distinctiveness 


One view of CS is that it is a “compromise strategy” in bilingual settings, and 
thus ultimately a mechanism for bringing the varieties closer together. On the other 
hand, it is also contrasted with other instances of language contact, such as 
borrowing, and defined quite on the contrary as the one type of language inter- 
action where each variety preserves its character. In fact it is clear from data 
collected in many communities that there is no one-to-one correlation between 
CS and language change. Thomason (2001) has considered this question at length 
and concludes that CS is not a major mechanism in contact-induced change, as 
it does not result from “imperfect learning,” though it is one of the principal mech- 
anisms of borrowing. She points out how varied and heterogeneous the sources 
of language change are, and that most of the general rules are there to be broken. 
To take an example, language change often takes place when a minority group 
becomes bilingual and adopts features of the L2, as in the case of the Greek com- 
munity in Asia Minor before 1922, who adopted numerous features of Turkish. 
This conforms to the general rule that the language of the majority is adopted by 
the minority rather than vice versa. However, this tendency is in conflict with 
another one, i.e. that the language of the elite is adopted by the subordinate 
group, and the elite is, almost by definition, the minority. Thus it seems that native 
speakers of Turkish (i.e. the majority) also played a role in bringing features 
of Turkish into Asia Minor Greek, because some of them had learnt Greek as a 
second language (2001: 67). 

Backus (2005) explores the complex relationship between CS and structural 
change. He lists numerous problems, both methodological and conceptual, with 
the hypothesis that structural change is directly brought about by CS. For example, 
at a methodological level, one needs to have a complete picture of the range of 
variation of the structure within the monolingual variety in order to establish 
that it was indeed CS, and only CS, which had brought about the change. At a 
conceptual level, one needs to understand how exactly the fact of code-switching 
could bring about a change in the structure of a language, and distinguish this 
from calques in the idiolect of individual code-switchers. 


2.2 Studies which show CS to be bound up with shift 


Despite the examples which show that CS is not invariably symptomatic of shift, 
in many bilingual communities it may nevertheless be identified as such. This is 
notably the case when there is a correlation between speakers’ age and the type 
of CS which they use. Bentahila and Davies (1991; 1998) analyzed the CS of 
Arabic-French bilinguals in Morocco, and found clear differences in the patterns 
characterizing the younger and the older groups, which were related to differ- 
ences in the role played by French in their education and background. In a study 
of Arvanitika, a dying Albanian dialect spoken in Greece, Trudgill (1976-7) found 
that CS into Greek was used as a compensatory strategy by younger speakers 
who were less competent in Arvanitika (which made it difficult to identify which 
aspects of Arvanitika were being lost). Lavandera (1978) reported a similar phe- 
nomenon in Buenos Aires where migrants of Italian origin switched between 
their own variety of Argentinian Spanish, Cocoliche, and Standard Argentinian 
Spanish to compensate for reduced stylistic options available to them in either 
variety. Schmidt (1985) contrasted the older, more fluent speakers of Dyirbal, a 
dying Aboriginal language of Australia, who only used individual words in English, 
e.g. all right, now, finish (= period), with moderately proficient speakers who used 
many more imported English words, often adapted to Dyirbal morphology, e.g. 
ring-iman ‘she phoned him’, jayil-gu ‘to jail’, one night, down there, etc. Finally, 
the younger speakers, who were much less fluent in Dyirbal, code-switched 
copiously between the two varieties: 


(1) We tryin’ to warn ban wuigi nomo wurrbay-gu. 
‘We were trying to warn her not to speak [Dyirbal].’ 
(Schmidt 1985) 


There is also historical evidence which suggests that CS can be a contributory 
factor in language shift. This has notably been studied in the context of Middle 
English, which relexified under the influence of Norman French and Latin. 
Bilingualism and CS must have played a major role in the process of lexical 
borrowing and mixed-language texts provide interesting information on the 
process of widespread relexification of English in the Middle English period. 
“Generations of educated Englishmen passed daily from English into French and 
back again in the course of their work, a process which must have led to specific 
lexical transfers both in the field of technical and of general vocabulary” (Schendl 
2002: 86). Schendl’s examples of CS between English, French, and Latin from a 
variety of medieval genres (business, religious, legal, and scientific texts) support 
this claim. 


2.3 CS and the processes of language contact 


It is most accurate to view CS as taking place in a context where there is shift in 
progress, rather than constituting shift of itself. Within a given contact situation, 
convergent and nonconvergent tendencies can coexist. This was shown for example 
in Gumperz’s early study of Hindi-Punjabi CS in Delhi (1964), where CS persisted 
at a lexical level although the two closely related varieties had substantially 
converged at a grammatical level after decades of contact. Gumperz concluded 
that the contrast between them was functional, as the languages would otherwise 
have converged entirely. 

A more recent study which highlights the complexity of CS in particular situ- 
ations is Rindler-Schjerve’s (1998) study of Italian and Sardinian. She emphasizes 
that although the switching occurs in a context of language shift, it “should not 
be seen as a mechanism which accelerates the shift” (1998: 247). If CS were a 
mechanism of change, one would expect the less proficient bilinguals to code-switch 
most. But in Sardinia, it is the more balanced bilinguals who switch most, and 
who also “contribute to the maintenance of Sardinian in that they change the 
Sardinian language by adapting it to the majority language thus narrowing the 
gap between the two closely related codes” (1998: 246). The Italian expression 
secondo me (= ‘in my opinion’) is inserted below into a Sardinian sentence, but 
adapted to Sardinian phonology in order to minimize the transition (segunnu me). 
Its function is to highlight or separate the parenthetical expression ‘in my opinion’. 


(2) Non m’an giamadu ‘e veterinariu ma segunnu me fi calchicosa chi a manigadu 
‘They didn’t call a vet but in my opinion it was something which it has eaten.’ 
(Rindler-Schjerve 1998: 243) 


Thus CS can arise in situations of widely varying stability. The contact which gives 
rise to it can be of differing durations, and can affect different sections of the com- 
munity in different ways. Change can be fast or slow and can affect all aspects 
of language (Gumperz & Wilson 1971; Dressler & Wodak-Leodolter 1977). It can 
take place over several generations or, effectively, within a single generation. It 
may be difficult to detect, losses being masked by code-switches (Trudgill 1976-7) 
or internal restructuring (Tsitsipis 1998), and it can occur either with heavy lin- 
guistic symptoms such as morphological loss or without them (Dorian 1981; Schmidt 
1985). A range of linguistic configurations can arise along the road to extinction, 
CS being only one of the possible scenarios. 

Auer (1999) sees CS as the first point in a chronological progression along a 
continuum. At the CS stage, the point in the sentence where there is a switch is 
a significant aspect of the conversation. The next stage is language mixing, where, 
as in example (2), it is not the individual switch points which carry significance, 
but the use of the overall switching mode; this stage is also described by 
Myers-Scotton (1993) as “switching as an unmarked choice.” The third stage is that of 
fused lects, which are stabilized mixed varieties. This coincides with a loss of vari- 
ation: the use of elements from one or other variety is no longer a matter of choice, 
but of grammatical convention. Structures from each variety, which are equiva- 
lent in monolingual usage, develop specialized uses. Auer claims this process is 
unidirectional. It may never be completed, as bilingual communities may stabilize 
at any point along the way, but it does not allow for any movement in the oppo- 
site direction. Auer goes so far as to say that the movement from fused lects to 
language mixing to CS is “prohibited.” 

Clearly, the social circumstances which give rise to mixing in the first place 
cannot be “put into reverse”: English is not likely to split up into Anglo-Saxon, 
Norman French, etc. On the other hand, if the starting point is two varieties which 
are somewhere toward the middle of the continuum, say somewhere between CS 
and language mixing, then these could in theory split or become more similar to 
(one of) the component varieties. This is akin to what happens in decreolization. 
If we take the example of the Creole-London English mixture described in Sebba 
(1993), one of the elements, London English, has an independent existence anyway, 
and it is conceivable that the creole could begin to be used more independently, 
as it could feed off the creoles which originally gave rise to it. 

As we have seen, different contact processes, and different kinds of CS, can 
coexist in the same community at a given time. We should distinguish situations 
where there is a progression from one state to another from those where the 
different phenomena simply overlap chronologically. As Crowley pointed out with 
respect to pidginization in Bislama (1990: 385): “Rather than trying to divide the 
history of Melanesian Pidgin into chronologically distinct developmental phases, 
we should regard stabilization, destabilization, and creolization as all contributing 
simultaneously to the gradual evolution of the language from the very beginning.” 

There are also circumstances where contact-induced change does not proceed 
smoothly through all three stages. After some convergence has taken place, 
instead of the old variety being abandoned in favor of the new, the altered (code- 
switched) variety, brought about through contact, may assume distinct functions 
of its own. This is relevant to the Delhi case mentioned above (Gumperz 1964), 
and may explain the formation of certain creoles. 


2.4 Stabilization of CS varieties 


CS varieties may stabilize when they assume an identity function. They are often 
characteristic of young second-generation immigrant communities which develop 
a pride in their mixed identity. Such an intracommunity variety is, for example, 
emerging among the second-generation Portuguese in France. Known as immigrais 
‘immigrese’, by their own description a “bastard” Portuguese, including French 
words and colloquialisms, this variety is intolerable to purists, but understood 
by the 900,000-strong immigrant population. In Albatroz, a trilingual literary 
review in Portuguese, French, and Immigrais, the proponents of this variety write: 


Through no will of our own, we’re foreigners. But we have a tool: the Portuguese 
language contaminated by positive pollution. Be that as it may, we write in both 
languages, copulating frenetically, with the outcome that literary and pictorial 
objects are pleasantly produced. It'll be alright. We respect none of the spelling or 
vocabulary rules of the immigrese language and although we may perturb, we demand 
that making mistakes be recognized as a technique for exploring the ambiguity of a 
text or as a gimmick of polysemic amplification. Don’t you agree? And while we're 
at it, let’s do away with the cedilla! Irreverence, dear reader, irreverence. (Munoz 
1999: 31, my translation) 


Hewitt (1986) and Sebba (1993) have described types of CS which occur with 
symbolic and discourse-related functions in London, between creole- and London 
English-speaking young people. Although used sparingly, creole is said by Sebba 
to have a we-code function and CS is described as an “insider activity.” Creole is 
almost certainly preserved by being used in this way, as the majority of young 
people who were the subject of his study would not be able to use it as a means 
of expression on its own. Zadie Smith’s White Teeth portrays the lives of young 
Afro-Caribbeans in London who show this type of CS as an intrinsic — if somewhat 
inconsistent — aspect of their speech. 

There are also an increasing number of popular singers and bands whose lyrics 
are code-switched, e.g. Ricky Martin, Raggasonic, and various Punjabi bands in 
Britain (Asian Dub Foundation, Bally Sagoo, Apache Indian), attesting the vitality 
which code-switched varieties can have in their own right. Raggasonic for example 
is a Rasta band who sing in English, French, and a mixture of both. Along with 
English-French CS (‘la dancehall est full’), their lyrics contain morphologically 
adapted words, borrowed from one language to the other (e.g. toaster); elements 
of creole (bwoy) and of verlan, a type of slang which involves inverting the first 
and second syllables of multisyllable words and pronouncing single syllable words 
backwards (from French envers = ‘back to front’), e.g. checla, from “clasher,” ‘to 
clash’, already a code-switched word, or ssepa, from the French passe, ‘passes’. 

Prescriptive “rules” such as speaking only one language at a time are deliber- 
ately and playfully broken — a function of CS which has been described as 
“ludic” (Caubet 2001; McCormick 2002). The mixing is not a precursor of any of 
the varieties involved disappearing, at least not among the writers and consumers 
of such lyrics, who are fluent multilinguals — if anything it is the purists who should 
feel threatened! The various functions of CS in rap music are explored in Sarkar 
and Winer (2006) and in North African-French rai music by Bentahila and Davies 
(2002). Both claim that one of its functions is to reconcile the conflicting trends of 
localization and globalization. 

This again reinforces the claim that CS can occur both in situations of decline 
and as a mechanism of vitality. An example of the first is provided by Aikhenvald 
(2002), who shows how an Amazonian community, the Tariana, have a strong 
inhibition against any kind of mixing of their language with that of other local 
tribes, such as the Tucano and the Maku, whom they despise. When such mixing 
occurs, it evokes ridicule and pity. However, they are increasingly allowing CS 
with Portuguese and even with English, which symbolizes “everything a capitalist 
paradise could offer” (2002: 211). By contrast, CS with a much more positive 
identity function is illustrated by Gibbons (1987), who describes a variety called 
“MIX” spoken by students at Hong Kong University. Unlike a creole, MIX has 
emerged within a single community and the processes which have occurred have 
taken place entirely within this relatively homogeneous group. 

Heller (1988) suggests as a priority for CS research looking at “the extent to 
which different types of CS are related to different types of boundary maintenance/ 
change processes” and “the generalizability of findings concerning the social con- 
ditions under which CS is or is not found” (1988: 268-9). More comparative research 
is needed for us to realize this aim. 


3 Code-Switching and Language Change 


The role of CS, along with other symptoms of contact, in language change is still 
a matter of discussion (Clyne 2003; Harris & Campbell 1995). On the one hand the 
relationship between contact and language change is now generally acknowledged: 
few espouse the traditional view that change follows universal, language-internal 
principles such as simplification, and takes place in the absence of contact with 
other varieties (James Milroy 1998). On the other hand, as suggested above, some 
researchers still downplay the role of CS in change, and contrast it with 
borrowing, which is seen as a form of convergence. For example Poplack (1980) argued 
from data from the Puerto-Rican community in New York that CS represented 
the alternation of two varieties, English and Spanish, which preserved their 
monolingual characteristics. 


3.1 CS and borrowing 


Much discussion in the literature on CS has centered around the relationship 
between borrowing and CS (Romaine 1995; Myers-Scotton 1992). 

The need to clarify this relationship arises because single words — usually com- 
mon nouns — are both the most frequently borrowed and code-switched items 
(Poplack, Sankoff, & Miller 1988: 62). One explanation of this is provided by Bynon 
(1977: 231), who says this just reflects the size of the grammatical categories con- 
cerned. Another is suggested by Aitchison (1994: 62), who points out that nouns 
are freer of syntactic restrictions than other word classes. Both these explanations 
involve “language-internal” factors rather than sociolinguistic ones. A third reason 
for the prevalence of single-word switches/loans is that these are accessible to 
bilinguals with even minimal competence in the donor language. 

In fact there is no failsafe method of distinguishing, at a synchronic level, between 
loans and code-switches. Every loan presumably starts life as spontaneous CS and 
some of these switches then generalize themselves among speakers of the host 
language (Haust 1995; Gardner-Chloros 1995). The diachronic nature of this pro- 
cess is shown by the fact that transfers which occur at different historical stages 
of contact between the languages may go through quite different processes of 
integration and end up looking quite different in the receiving language. Heath 
(1989), for example, has shown how some French verbs borrowed in the early 
French colonial period were adopted in Moroccan Arabic without inflectable 
verb frames. Others borrowed more recently have been instantly provided with 
Moroccan Arabic inflectional frames. Their phonemes have also been imported 
wholesale, and show signs of stabilizing as such (1989: 203). 


Grammatical category 
Nouns may be the most frequently borrowed — and switched — word class owing 
to their grammatically self-contained character, but all grammatical categories are 
potentially transferable (Boeschoten 1998). In certain datasets, it is not single nouns 
which are more frequently transferred. In a comparative study of Punjabi-English 
bilinguals and Greek Cypriot-English bilinguals, whereas single-word switching 
was the commonest type among the Greek Cypriots, intrasentential switching was 
almost three times as frequent as single-word switching among the Punjabis 
(Cheshire and Gardner-Chloros 1998). The Punjabi speakers switched massively 
more than the Greek Cypriots overall, despite the fact that typologically speak- 
ing, it should be easier to produce complex switches between Greek and English 
than between Punjabi and English. A hypothesis worth testing is that the more 
CS there is overall, the smaller the proportion that single-word switching represents. 
The following two sections cover criteria which have been held to differentiate 
loans and code-switches. 


Morphophonemic integration with the surrounding language 

Poplack (1980) argued that transfers which were morphophonemically integrated 
with the borrowing language should be classed as loans. Unfortunately, this means 
we cannot use the synchronic versus diachronic criterion for distinguishing 
them, because such integration is often an instant process. Both ad hoc code-switched 
nouns and verbs, as well as more established loans, can take the mor- 
phophonology of the borrowing variety, as examples from many language 
combinations have shown: e.g. French—Alsatian déménagiert (‘moved house’), 
from French déménager (Gardner-Chloros 1991); English-Maori changedngia (‘to 
change’) (Eliasson 1990); English-Spanish coughas (‘you cough’) (Zimman 1993); 
German-English gedropped (‘dropped’) (Eppler 1991); etc. 


Native synonym displacement 

It has been suggested that loans and code-switches can be distinguished on the 
basis that loans are more likely to be filling a “lexical gap” in the host language, 
whereas code-switches tend simply to add themselves as a further option to the 
native equivalent. But this fails to take account of the variety and subtlety of func- 
tions which CS can fulfil. CS too may be a way of “supplementing” the resources 
of the host language. On other occasions it serves quite different conversational 
functions, as is shown in cases where CS is a — more or less exact — repetition of 
what the speaker said in the other variety. Consider the following Punjabi- 
English example: 


(3) RENU... and she was sleeping all over the place, so I had to stay awake 
digdthi-firdthi si everywhere, so I had to stay awake 
‘she was falling around everywhere, so I had to stay awake’ 
(Gardner-Chloros, Charles, & Cheshire, 2000: 1319) 


The near-repetition here of “she was sleeping all over the place” by “digdthi-firdthi 
si everywhere” (‘she was falling around everywhere’) is functional in terms of its 
effect within the discourse, further explaining the need to wake the person up, 
and using one language to qualify what was said in the other (Gardner-Chloros 
et al. 2000). This may be contrasted with the more obvious function of mot juste 
switching (Poplack 1980), where speakers switch without any repetition being 
involved because the other language contains the more accurate term. 


(4) No me precipitaré en el famoso name-dropping 
‘I will not throw myself headlong into the famous practice of name-dropping’ 
(McClure 1998: 134) 


Whether or not this type of CS leads to the word becoming a loan, which fits in 
with the “lexical gap” criterion, does not depend on linguistic factors alone, but 
also on sociocultural ones, as was found, for example, by Treffers-Daller (1994). 
In a study of French—Dutch contact in Brussels, she found that many words 
used by her subjects, such as unique and sympathique, belonged equally to either 
variety (i.e. there was no variation between them and native Dutch equivalents 
regardless of the language of the conversation). Furthermore, as both French and 
Dutch have limited morphological marking, the criterion of morphological inte- 
gration could not be used either to determine to which language they ultimately 
“belonged.” Thirdly, both borrowings and code-switches varied as to whether 
they were syntactically integrated or unintegrated. Treffers-Daller therefore 
concluded that a unified theory of borrowing and CS was needed. 

It is fair to add that despite this, in certain corpora, loans and code-switches 
may be distinguishable in some specific way related to their use. For example, in 
Jones (2005), speakers of Jersey French (‘Jérriais’) tended to flag code-switched 
forms (with pauses, self-corrections, etc.), but not borrowings; also, those infor- 
mants with the most positive attitude toward Jérriais avoided CS entirely, though 
they did use borrowings. Thus it was not the linguistic phenomena as such but 
the way in which speakers used them which determined whether “CS” or “loan” 
was the most suitable way of characterizing them. 


3.2 CS and pidginization/creolization 


CS is found alongside pidginization and creolization/decreolization in many parts 
of the world, and contributes to the process of convergence of different varieties. 
Gumperz (1964) and Gumperz and Wilson (1971) identified the close relationship 
between CS and creolization in the early days of CS research; more recently the 
trend has been to emphasize the differences. Though there are differences, it is 
important to note that the processes often co-occur, derive from similar social 
factors, and may even lead to similar outcomes. 

Crowley (1990), for example, argues that in Melanesian Pidgin, access to the 
substratum persists at all levels: stabilization, destabilization, and creolization should 
all, he claims, be regarded as contributing simultaneously to the gradual evolu- 
tion of the language (1990: 384-6). As to CS, its presence or absence seems to be 
largely a function of the prestige attached to the pidgin. Comparing usage on 
the radio in Vanuatu and the Solomon Islands, Crowley finds much more CS in 
the Solomon Islands, where the pidgin’s prestige is low, than in Vanuatu, to such 
an extent that in the case of the Solomon Islands announcers, “It is sometimes 
difficult to know which language they would claim to be speaking.” 

In Papua New Guinea, Romaine (1992) shows how CS occurs within the 
context of a post-creole continuum which has emerged in the preceding 20 years: 
“In town, Standard English, English spoken as a second language with varying 
degrees of fluency, highly anglicized Tok Pisin, more rural Tok Pisin of migrants, 
and the creolized Tok Pisin of the urban-born coexist and loosely reflect the emerging 
social stratification” (1992: 323). Whereas some have been adamant that this 
situation should be seen in terms of ongoing CS between Tok Pisin and English 
(Siegel 1994), Romaine believes that there is no principled way for determining 
to which language many utterances belong (1992: 322). 

Le Page and Tabouret-Keller (1985) give examples of creole speech from 
sources in the Caribbean, Belize, and London Jamaican. In such linguistically 
diffuse situations, CS involves shifting at particular linguistic levels rather than 
a wholesale transition from one variety to another. Discrete codes are hard to come 
by in linguistic contexts as unfocused as those they describe. Sometimes, as in 
the case of London Jamaican, the switching between the codes is more symbolic 
than real: 


London Jamaican is more a set of norms to be aimed at than an internally coher- 
ent and consistent system. Speakers behave as if there were a language called 
“Jamaican”, but often all they do (perhaps all they know how to do) is to make ges- 
tures in the direction of certain tokens associated with Jamaican Creole which have 
a stereotypical value. (Le Page & Tabouret-Keller 1985: 180; see also Sebba 1993). 


A specific ambiguous example is that of compound verb formations, which occur 
in many bilingual settings (Edwards & Gardner-Chloros 2007). Although generally 
considered under the heading of CS, these formations show features of creoliza- 
tion, as they involve grammatical convergence and an analytic approach to 
vocabulary. For example, in English—Punjabi CS, Punjabi verbal “operators” 
meaning ‘do’, ‘make’ are commonly combined, in the speech of bilinguals, with 
a major category (noun, verb, adjective) taken from English, to make new verbal 
compounds which function as a single syntactic/semantic unit, e.g. ple kerna, where 
ple is from English play and kerna means ‘to do/make’ in Punjabi (Romaine 1986). 
The new compound, which means ‘to play’, is synonymous with an existing Punjabi 
verb. Parallel creations have been attested in innumerable, typologically diverse, 
language combinations, including some where, unlike Punjabi, neither language 
provides a native model for the compound formation. For example, in kamno use, 
kamno respect, kamno developed, kamno spelling, the Cypriot form kamno (‘make/do’) 
is combined with various English words to make new verbs. These compounds 
appear superfluous in that the same meaning could in each case be conveyed by 
a single equivalent in Greek (Edwards & Gardner-Chloros 2007). 

As we have seen before, the different types of contact-induced varieties do not 
have clear boundaries (Thomason 1997). Heath (1989) gives examples of French 
borrowings into Arabic and Arabic borrowings into Turkish, which occur in inten- 
sive contact zones with alternative, productive adaptation routines. Along with 
the avoidance of inflections reminiscent of pidgins, some of these importations 
are also unstable with regard to word class, such as himri in Moroccan Arabic 
(from English hungry), which fluctuates between noun and adjective (1989: 202). 
The difference lies in the social function. Whereas pidgins and creoles arise as 
lingua francas, other bilingual mixtures of this type “usually or always serve as 
salient markers of ethnic-group identity” (Thomason 1997: 4). 

Time and again, we find that understanding the role of CS in language contact 
(or indeed its presence or absence in the first place) depends on grasping the 
interplay of structural and social factors. Treffers-Daller (1994; 1999) compared CS 
between French and Brussels Dutch in Brussels with that between French and the 
Alsatian dialect in Strasbourg. Despite being in different countries, in both cases 
the Germanic language is in contact with French. Both Alsatian and Brussels Dutch 
have borrowed extensively from French in their respective contexts. However in 
terms of identity the two situations are very different. Whereas a mixed identity, 
relating both to French and Alsatian, is the norm in Strasbourg, in Brussels the 
two linguistic groups are more polarized: almost all the population identify either 
with French or Dutch, but not with both. Consequently, as Treffers-Daller points 
out, Brusselers “no longer consider the mixed code to be an appropriate expres- 
sion of their identity” and intrasentential CS is accordingly infrequent. 


3.3 Intervening variables: a practical example 


The relationship between CS and language contact is both complex and indirect, 
so it is necessary to look at intervening factors if we are to understand the con- 
nections between them. A similar remark can of course be made for other areas 
of interest in sociolinguistics, such as language and gender. In that area, for ex- 
ample, it has increasingly become apparent that differences between women’s and 
men’s speech have to be understood not simply in terms of gender roles as such 
but in relation to other factors connected with those roles, such as the power 
relationship between the speakers and the conventions governing behavior in 
the community. “We must criticize explanations of difference that treat gender 
as something obvious, static and monolithic, ignoring the forces that shape it and 
the varied forms they take in different times and places” (Cameron 1992: 40). The 
example discussed here illustrates the role of some intervening factors in CS and 
in gender at the same time. 

Like other second-/third-generation migrants, London Greek Cypriots use a 
code-switched mode among themselves as a mark of ingroup identity (Gardner- 
Chloros, McEntee-Atalianis, & Finnis 2005; Gardner-Chloros & Finnis 2004). At 
first glance, gender did not appear to be a major differentiator in relation to CS: 
the overall quantity of CS employed by young second-generation men and 
women was roughly similar. On closer examination, however, it appeared that 
each gender used CS for rather different purposes. 

These purposes can best be understood with reference to the roles which 
characterize women and men in this community on the one hand, and to linguistic 
politeness on the other (Brown & Levinson 1987). Switching from English to the 
Greek Cypriot Dialect can be indicative of positive politeness (PP), as in the use 
of diminutives or terms of endearment or expressions of sympathy or interest. In 
the example below, the first (female) speaker (F1) is complaining that her parents 
have started agitating for her to have an arranged marriage. A male friend (M1) 
asks for clarification, but when her female friend (F2) also asks her to specify 
further as to which parent is involved, she switches to the Greek Cypriot Dialect, 
thereby indicating (female) solidarity: 


(5) 1 F1 Am I the only person that gets ??? by their parents already? 
2 M1 What, about getting married? 
3 F1 Yeah, she started today. 
4 F2 ??? μάνα σου? 
??? mana sou? 
‘your mother?’ 


Secondly, CS was used in the community for negative politeness (NP). As 
others have remarked, CS can provide a “double voice” which, like some indi- 
rect reported speech, blurs the ultimate authorship of the remark (Stroud 1998). 
Whereas a direct request imposes an immediate obligation on the hearer to 
respond, CS acts to distance the request and thereby allows the addressee more 
leeway as to how he/she acts upon the utterance. 


(6) 1. M1 Hello! 
2. F1 Ναι, συγνώμη! Άντε τελείωνε. Εν να βγούμε έξω! 
Ne, siynomi! Ade telione. In na vyume ekso! 
‘Yes, sorry! Come on, hurry up and finish. We are (due) to go out!’ 


In this example, speaker M1 is trying to get speaker F1’s attention, as she is being 
noisy. She responds to him in Greek. Although she apologizes, she sounds quite 
abrupt, but her switch to another variety than the one used most of the time at 
the meeting allows her to get away with this abruptness, and introduces a lighter 
note. Being “indirectly direct” seems here to be a particularly female strategy. 

Thirdly, speakers of both sexes switched to Greek Cypriot for humor. This 
type of switching acted as a legitimating device for women in particular: as has 
often been pointed out, women are less inclined than men to indulge in overt 
humor. Kaplan, for example, observes that “In many cultures, there is a strong 
taboo against women telling jokes. If we think of jokes as the derepressed sym- 
bolic discourse of common speech, we can see why jokes, particularly obscene 
ones, are rarely spoken from the perspective of femininity” (1998: 58). Example 
(7) is taken from a youth meeting. The speakers are talking about accessing a color 
laser printer in order to print out a substantial number of flyers to distribute 
to their members. The (female) speaker (F1) switches to Greek to insert playful 
discourse into the interaction. 


(7) M1 ... ??? happen to know anyone that has like a colour laser jet ... 
F1 I know a place where they do ??? 
M1 yeah 
F1 ??? 
M1 what make are they? 
F1 Εν ηξέρω, εν λεπτομέρειες 
en iksero, en leptomeries 
‘I don’t know, these are details’ (general laughter) 


This is also an instance of “double-voicing” as the speaker adopts the voice or 
persona of a particular Greek stereotype, that of a laid-back type who is un- 
concerned with practical details. In this way, she justifies her ignorance of the 
technical details of the photocopier and at the same time causes laughter through 
her performance. 

A fourth function of CS which turned out to be particularly associated with 
women was its use in attenuating the directness of orders or requests. The use of 
a different variety allows the speaker to “shift authorship,” and distance herself 
from her directness: she is carrying out a face-threatening act (an order) but the 
language switch softens it in several ways. Below, speaker F1, after asking the 
same question in English twice and failing to get a response from speaker M1, 
switches to Greek to elicit a response. Having succeeded in doing so, she then 
switches back to English. 


(8) M1 All right 
F1 Stop, how many days is the conference? 
M1 Guys, I wanna finish at seven o'clock 
F1 I’m asking! How many days is the conference? 
M1 ??? It’s half past six. 
F1 Κύριε Μένικο, πόσες ημέρες είναι? 
kirie Meniko, poses imeres ine? 
‘Mr Meniko, how many days is it?’ 
M1 It will be around four days, I imagine 
F1 Ok, four days, good... and what time? 


The potentially face-threatening act — a culmination of repeated questions which 
had already been phrased pretty directly — is carried off thanks to the switch to 
Greek, which not only allows greater directness but is also the “we-code” and 
evokes a shared identity or closeness between the speakers. 

All in all, CS strategies in this setting appeared to be particularly useful for women 
in getting round some of the traditional constraints on female discourse, such as 
the expectation that it will be less forceful, pressing, or direct than that of men, 
or that making jokes is unfeminine. In other forms of humor, e.g. the use of coarse- 
ness and references to peasantness, CS appeared to be more of a male strategy. 
Women also used CS for solidarity in certain contexts which were directly rele- 
vant to them, e.g. in talking about mothers and their attitudes toward their 
daughter’s marital status. 

The conversational tactics associated with CS are used by both genders; and 
both sexes use CS in similar ways to carry out acts of PP and NP. Our observa- 
tions suggest however that women in this community make particular use of CS 
as a softening device to carry out certain direct speech acts, which require NP 
and PP strategies to attenuate their directness. The directness in itself seems to 
be a way for women to hold their own when they are interacting with men. Thus 
they can both stand up to the opposite sex through their forthright repartees, and 
avoid sounding overbearing thanks to the humorous undertones brought in by 
CS, or the overtones of solidarity associated with using Greek Cypriot dialect, the 
shared we-code. They are further aided by the greater acceptability of direct- 
ness in Greek culture. As this community does still support clear gender roles, 
and the convention is for women to be less direct and assertive than men, it was 
understandable that women were exploiting the possibilities offered by CS in this 
particular way. To the extent that one can show that gender differences are con- 
tingent upon culturally determined norms, the role of gender as such is relativized. 


4 Conclusion 


CS is one of the possible outcomes of contact between two (or more) varieties, and 
often coexists and overlaps with other outcomes. Owing to the range of linguistic 
guises which it adopts, it has sometimes been ignored altogether in studies of 
language contact, and is sometimes defined as being more neat and tidy than 
it actually is. There are examples of relatively stable bilingual situations where 
two varieties apparently alternate without affecting one another’s essential char- 
acter, but in other cases CS is rule-breaking behavior, which should be seen not 
in relation to static norms but in terms of language change and convergence. A 
fuller exploration of the mechanisms by which the latter occurs is still needed (see 
the papers in Bilingualism, Language and Cognition 7.2, 2004). Otheguy, for example, 
claims that convergence occurs at the semantic-pragmatic interface, with bilinguals 
selecting “the most parsimonious grammar that serves both languages” (2004: 
167). Bullock and Toribio (2004) distinguish it from interference and transfer; rather 
than implying the imposition of a structural property from one language on another, 
they see it as an “enhancement of inherent structural similarities found between 
two linguistic systems” (2004: 91). It is to be hoped that the relation of CS to these 
processes will continue to be investigated. 

Above all, it increasingly appears that sociolinguistic factors are the key to under- 
standing why CS takes the form it does in each individual case. Such factors affect 
different subgroups in different ways, which explains why there are often dif- 
ferent types of CS within the same community. At a social level, CS may be seen 
as the product of a power struggle between two varieties (Pujolar 2001). At an 
individual level, it reflects varying bilingual competences and serves as a discourse- 
structuring device. Like other outcomes of language contact discussed in this 
volume, CS is not a passive victim of linguistic forces. 


NOTES 


This chapter draws on material from Code-Switching by the same author (Cambridge 
University Press, 2009). The permission of Cambridge University Press to reprint extracts 
from the book is acknowledged with thanks. Section 4 uses material which first appeared 
in Gardner-Chloros and Finnis (2004). 

1 Several of these instances are illustrated in Gardner-Chloros (2009). 

2 Relexification is the process by which a language which retains its basic grammatical 
identity replaces part or all of its lexical stock with words from another variety. 

3 My translation. 

4 “???” indicates an unclear passage in the recording. 

5 Greek has been transcribed for the purposes of these examples into a semi-phonetic 
Roman script, retaining the Greek letters /χ/, /δ/, and /γ/ which sound the same as 
those in the IPA (International Phonetic Alphabet). Other sounds follow English 
spelling: /θ/ is represented as “th” and /ʃ/ as “sh.” The phonetic symbol /j/ is left 
as an i (e.g. [ja] = ia) so as to avoid confusion with the English letter j. 


10 Contact and Dialectology 


DAVID BRITAIN 


1 Introduction 


This chapter examines the outcomes of routine dialect contact. Whilst some of the 
more extreme forms of contact — for example, those triggered by colonization, forced 
labor movements, and other forms of long-distance mass migration (see Kerswill, 
this volume) — tend to result in rather significant and dramatic linguistic changes, 
I concentrate here on the consequences of interdialect contact that occurs at a more 
local level as members of speech communities go about their everyday lives. It 
will become evident that the sorts of change that occur at this local level are indeed 
typologically very similar to, if less dramatic than, those which research has shown 
to take place in contexts of radical and extreme contact. That we tend to treat them 
differently is partly, I would argue, because, until recently, dialectology had not 
fully embraced the fact that, as they go about their routine, mundane, day-to-day 
business, people move and they often do so for the purposes of interaction, 
interaction which brings them into contact with people who necessarily will speak (often 
subtly, often not) different language varieties. Linguists have only relatively 
recently begun to pay heed to this general shuffling around, but it is important 
shuffling in terms of who we talk to, where, and in what contexts. Dialectologists 
have indeed become more acutely aware of the need to build mobility into their 
theoretical models (e.g. Trudgill 1986), but this has, until fairly recently, gener- 
ally focused upon fairly long-term, long-distance, permanent mobilities rather than 
those of the taken-for-granted everyday kind. Traditionally, dialectology has 
kept people still (see also Britain, in press b), and has to some extent been guilty 
of what the anthropologist Liisa Malkki has called a “sedentarist metaphysics” 
(Malkki 1992; see also Cresswell 2006). 

I begin by arguing that linguistic accommodation is key to our understanding 
of the outcomes of everyday contact, and that routinization and regularity of 
contact, across time, help turn accommodation into acquisition. I then survey the 
evidence suggesting that linguistic innovation diffusion should be conceptualized 
as dialect contact, before examining the outcomes of contact induced by everyday 
kinds of personal mobility: moving to the city to work, moving to the country to 
escape the hustle and bustle, moving to climb the housing ladder, moving to buy 
the ingredients for tonight’s dinner. These outcomes include dialect supralocal- 
ization, leveling, and various kinds of linguistic hybridity, such as reallocation 
and interdialect formation. The social and geographical intersections of people’s 
daily life paths create contexts where accommodatory linguistic behavior can 
occur, but I then examine contexts where, for whatever reason, contact is — or has 
been — limited, resulting in dialect boundaries. The survey of contact-induced dialec- 
tological change ends with a discussion of examples of contact which have, 
unusually, led to divergence rather than the expected convergence. The chapter 
concludes with a case study of an area of eastern England which exemplifies many 
of these contact outcomes — the Fens. As marshland it was a barrier to commu- 
nication, but more recently, following drainage and reclamation, the area has 
witnessed local dialect contact, though in the continuing context of its position at 
the peripheries of both the eastern and midland regions — in other words contact, 
but across a persisting physical, social, and perceptual barrier to communication. 


2 Contact and Accommodation 


Contact-based explanations for language change rest squarely on the basis of 
linguistic convergence in face-to-face interaction. Trudgill, for example, places post- 
adolescent linguistic accommodation at the root of changes that lead to koineiza- 
tion in contexts of dialect contact (see also Kerswill 2002; this volume; Milroy 2002; 
Trudgill 2004). It is argued that, in contexts of dialect contact, linguistic accom- 
modation to speakers of other varieties becomes routine, and the variants that 
emerge as a result of accommodatory behavior gradually stabilize and become 
more durable characteristics of that person’s linguistic repertoire. Such stabiliza- 
tion is assumed, also, to occur more readily in social contexts where people have 
developed strong networks in the post-contact community (Milroy 1980; Britain 
1997). Contact arguments rely, then, on short-term accommodation fossilizing into 
long-term acquisition. 

Perhaps surprisingly, there has been relatively little work examining the short- 
term phonological, grammatical, and other structural accommodation that is the 
prerequisite to longer term contact. Probably the most well-known studies are 
Coupland’s (1984) study of a Cardiff travel agent’s accommodation to her clients, 
and work within the audience design paradigm of linguistic style established by 
Bell (e.g. Bell 1984; 2001; Bell & Johnson 1997; Rickford & McNair-Knox 1994). 
These studies all showed that accommodation was not categorical or complete, 
varied according to the nature of the audience, and, for the most part, was not as 
broad in scope as the variation existing in the community as a whole. 

The question of which forms are accommodated to is a rather controversial one. 
Just as in contexts of koineization (Trudgill 1986; Kerswill 2002; this volume), lin- 
guistic variants which are well represented in the input dialects, variants which 
are “unmarked” — linguistically, socially, or regionally — tend, on the whole, to be 
very good candidates for adoption, but the roles both of perceptual salience (Trudgill 
1986; Kerswill & Williams 2002) and social factors such as local identity (Trudgill 
2008; Schneider 2008; Holmes & Kerswill 2008, etc.) are hotly contested. Some- 
times none of the input forms survive intact. The imprecision of accommodation 
is argued to be responsible for the emergence of new forms as a result of dialect 
contact that were not found in any of the varieties that formed part of the contact 
mêlée. Accommodation can therefore lead to linguistically intermediate forms 
developing (interdialect) or forms which result from the social or linguistic reassignment 
of functions in the new variety (reallocation). 

At the more local level under discussion in this chapter, while the consequences 
are not as extreme as in the sorts of dramatic dialect contact scenarios researched 
by Barz and Siegel (1988), Trudgill (1986), Kerswill and Williams (2000) and so 
on, mobility still provides the trigger for intervarietal accommodation as we come 
into contact with speakers of other varieties at work, on the train, when dropping 
the children off at school, in the post office, and at the coffee morning, in other words 
utterly routinely as part of our day-to-day lives. At one level, of course, we accom- 
modate whenever we speak to anyone else, but the routinization of mobilities 
beyond the home and the neighborhood, characteristic of late modernity, means 
that even mundane everyday contacts expose people in some parts of the world 
to a wide range of varieties, and regular accommodation (and its consequences) 
to markedly different variants becomes the norm. 


3 Contact and Diffusion 


Communities witnessing the arrival of linguistic innovations from outside rep- 
resent one context in which different dialect forms — here the innovation, on the 
one hand, and the conservative form under threat from that innovation, on the 
other — come into contact. The literature tends to concentrate, in these cases, on 
the speed and quantitative penetration of the spreading innovation as well as on 
the geographical distributions of the diffusing forms. Some features seem to spread 
in a wave-like manner from the central point of origin of the innovation, others 
seem to diffuse down an urban hierarchy of large city to smaller city to town, 
village, and country (which consequently does not appear wave-like in physical 
space), and still others spread from rural areas to more urban ones, so-called coun- 
terhierarchical changes (see Britain in press a, 2010b, for a review of diffusion 
models). For the most part, the diffusion literature has reported cases where the 
innovation has obliterated the conservative form leaving no trace. This is the case, 
for example, in studies of the spread of TH fronting (the replacement of /θ/ and 
noninitial /ð/ with [f] and [v] respectively) (see Kerswill 2003 for an excellent 
holistic account of this feature). Much less well reported, however, are: 


• The cases of linguistic hybridity which have resulted from the contact 
between advancing innovations and traditional features within a community. 
Trudgill and Foxcroft (1978) examined the consequences of the gradual 
northward advance of the merger on [au] of ME /ou/ (as in ‘knows’ and 
‘thrown’) and ME /ɔ:/ (as in ‘nose’ and ‘throne’) in East Anglia in England, 
where these forms were traditionally realized as [ʌu] and [ʊu] respectively. 
They show that for some speakers the merger proceeds by transfer, with the 
nonmerged form gradually being replaced by the merged form. For others, 
however, especially middle-class speakers, a hybrid form has emerged as a 
consequence of the contact between nonmerged and merged forms, so that 
words derived from both ME /ou/ and ME /ɔ:/ are realized as an interdialectal, 
intermediate [ou] (Trudgill & Foxcroft 1978: 73). Here, therefore, the arrival 
of an innovation results in the creation of an entirely novel form, unlike either 
the new or old variants, and leads to an overall expansion of the number 
of variants in the dialectological landscape, rather than a (perhaps expected) 
reduction. 

• Cases where the social evaluation of the innovation changes en route as a result 
of the existence of different sociolinguistic contextualities in the communities 
receiving the innovation from those that generated the innovation in the first 
place. For example, the glottalling of /t/ in London English — a stereotypic- 
ally “stigmatized” feature, most commonly used by working-class speakers — 
has diffused to the Welsh capital of Cardiff, where it is used mostly by 
young “middle-middle-class” speakers (Mees & Collins 1999). Consequently 
an innovation often evaluated as “sloppy” where it began, ends up, many miles 
away, being seen as a sign of urban chic. As Mees and Collins argue, “glottalisation 
represents ... a move away from local Welsh accent characteristics 
towards more sophisticated and fashionable speech ... [it] is associated with 
London life, metropolitan fashions and trend-setting attitudes; it can be heard 
from royalty as well as rock stars” (1999: 201). 

• The cases of innovative diffusing features themselves changing as they 
diffuse. 


Recent work by Labov (2007) has taken a close look at the sorts of changes dif- 
fusing features can undergo. Drawing upon his extensive work on the tensing 
and raising of short /a/ in the USA, he shows how the system of raising found 
in New York, with a distribution of tense and lax /a/ determined by complex 
phonological, grammatical, lexical, and stylistic constraints, has been simplified 
as it has been diffused to Albany, Cincinnati, and New Orleans. Table 10.1 shows 
the simplifications that have resulted from diffusion to these three cities. The tens- 
ing of /a/ continues apace, but some of the more marked constraints on tensing 
have been lost in the diffusion process. 

These simplifications contrast with the small but perceptible increments in changes 
that take place in relatively stable communities where parent to child transmis- 
sion of changes is dominant. Labov concludes that “contact across communities 
involves learning, primarily by adults, who acquire the new variants of the ori- 
ginating community in a somewhat diluted form... Adult learning is not only 
slower, but it is also relatively coarse: it loses much of the fine structure of the 
linguistic system being transmitted” (2007: 320). Labov also notes that “these 
contact phenomena share the common marks of adult language learning: the loss 
of linguistic configurations that are reliably transmitted only by the child language 
learner” (2007: 382). The idea that imperfect adult learning can be seen as one of 
the major causes of contact-induced change is one which has been extensively 
exemplified over a long period by Trudgill’s work on dialect contact and isolation (see, 
for example, 1986; 1992; 1996; 2002). 


Table 10.1 The consequences of the linguistic diffusion of the tensing and 
raising of /a/ from New York City to Albany, Cincinnati, and New Orleans 
(based on Labov 2007) 

Tensing contexts                      New York  Albany  Cincinnati         New Orleans 
Before /b d m n g f s θ dʒ ʃ/         ✓         ✓       Not before /g/,    Also before /v z/ 
                                                        but also before 
                                                        /v z/ 
Not in function words with            ✓         ✗       ✗                  ✗ 
  simple codas 
Not in open syllables without         ✓         ✗       ✗                  ✓ (but weakened 
  a morpheme boundary                                                      among younger 
                                                                           speakers) 
Not word-initially, except in         ✓         ✓       ✓                  Not in after 
  frequent words (e.g. ask, after) 
Lexical exceptions (e.g. avenue)      ✓         ?       ?                  ? 
Other conditions (e.g. not in         ✓         ?       ?                  ? 
  abbreviations, acronyms, 
  learned words) 


4 Contact and Mobility 


We talked earlier about the mobile nature of our everyday lives, a mobility that 
tends to be taken for granted and the dialectological consequences of which are 
relatively unexplored. The mobility we engage in at an everyday level is often 
highly routine — going to work, taking the kids to school, shopping, going to the 
same sandwich bar to pick up lunch, etc., and is especially routine for some groups 
(e.g. children) more than others. Through our mobility practices individuals, 
families, and communities build up routinized spatial patterns. Routines ensure 
that some of these “paths” are (often extremely) well worn, others much less so. 
At a higher scale, it is through the practice of routines, on a mass scale and over 
time, that “places” and “regions” emerge, as the result of the intersections and 
bundlings of people’s life paths (which, in turn, shape and are shaped by 
people’s attitudes to and ideologies concerning spaces and places). As Johnston 
makes clear, in going about their everyday spatialities, individuals navigate a medi- 
ated path between structure and agency as neither “ ‘bearers of structure’ with 
behaviour determined by forces over which [they] have little or no control,” nor 
“identified as operating unconstrained free will” (Johnston 1991: 51). People are 
free to move, but do so constrained by past practices and routines and by eco- 
nomic and social institutions. Viewing places and regions in this way emphasizes 
that they are shaped by practice, are processes rather than objects, have rather 
imprecise geographical scales, and show internal heterogeneity. So, for example, 
Allen, Massey, and Cochrane’s (1998) economic geographical research on the 
South East of England has highlighted, through the investigation of a number 
of measures such as income, salary increases, house price growth, changes in 
employment levels in different economic sectors, government public spending 
increases, and so on, the fluid and diverse nature of the South East as a “regional” 
process, one created, shaped, and reworked by practice, both personal and insti- 
tutional, within it. We produce places and regions, but they in turn provide the 
context — enabling as well as constraining — for that production. Not only are we 
therefore bound to some extent by what our predecessors have done in creating 
place and region, these places and regions can change if practice within them 
changes (see, further, Britain in press b). 

It is not controversial to argue that mobility practices have changed somewhat 
in the past century in many Western societies because of increasing urbanization 
and counter-urbanization; increased migration and immigration; increased labor 
mobility and flexibility; improvements in transportation technologies; a shift 
from primary and secondary to tertiary sector employment as the backbone of 
the economy; an increase in mobile and flexible working facilitated by transportation 
developments, the internet, and employment legislation; an expansion in higher 
levels of education (in places often well away from the local speech community); 
the normalization of long(er)-distance commuting; labor market flexibility; 
geographical reorientations of consumption behaviors (toward large out-of-town 
malls and hypermarkets, entertainment complexes, etc.); and increasing 
geographical elasticity of family ties. And this mobility has had sociolinguistic 
consequences. 

Since regions are one of the consequences that result from these mobile practices, 
one of the most obvious dialectological outcomes of intraregional movement is a 
change to the geographical scale of local dialect forms. As people’s lives routinely 
take them further and wider as they go about their daily business, so the cur- 
rency of dialect forms they use expands in scale too. Consequently highly local 
dialect forms have begun to be eroded, leveled away in favor of forms fulfilling 
the need for greater geographical scope. This process has been labeled “supra- 
localization” (Milroy, Milroy, & Hartley 1994), “regional dialect leveling” (Kerswill 
2003), or “supraregionalization” (Hickey 2003). 
Since mobility has always been a characteristic of human life, such supralocalization 
is not new (see Britain 2009, for example), but the scale of it now may well 
be unprecedented. What has to be made clear though is that regions neverthe- 
less retain a good deal of internal socioeconomic and sociogeographical diversity, 
and the process of supralocalization is one which, because it is advancing on a 
socioeconomically and geolinguistically differentiated landscape, will itself result 
in uneven dialect patterns. 

Vandekerckhove (e.g. 2005a, 2005b, 2009) has shown how mobility has been 
triggering regional convergence of the Dutch dialects of West Flanders in 
Belgium. West Flemish is a particularly interesting case because until very 
recently “local dialect has preserved, to a large extent, its monopoly position as 
informal group language” (Vandekerckhove 2005a: 111). She finds, however, that 
supralocal contact is leading to convergence between varieties in the region. 
Her analyses, investigating in particular the strong forms of subject, object, and 
possessive plural pronouns, variants of so-called “scherplange ô” (in words such 
as boot (= English ‘boat’)), and diminutive suffixes in four urban centres of West 
Flanders — Bruges, Poperinge, Roeselare, and Kortrijk — show that all are under- 
going leveling, not toward Standard Dutch, but to an ever-converging West-Flemish 
regiolect. That regional dialect, for some of the variables studied, also shows some 
influence from the supraregional colloquial Flemish variety that has come to be 
known as “tussentaal” (in English: ‘intermediate language’; Vandekerckhove 
2005b: 549), demonstrating a complex web of interacting geographical scales 
having important local dialectological consequences. For example, in her work on 
diminutive suffixes, data from studies in the late nineteenth and early twentieth 
centuries show different diminutive forms for ‘man’ in each of the four urban 
centres, but these have been leveled down to two very similar forms in the late 
twentieth century (see Table 10.2 below). 

Further contexts in which supralocalization is in progress also show examples 
of other typical (Kerswill, this volume) contact phenomena, such as leveling (the 
eradication of forms that are either minority and/or stigmatized and/or marked 
in the dialect mix in favor of majority, unmarked, neutral forms), reallocation 
(the survival of at least two forms in the dialect mix to serve new linguistic or 
sociolinguistic functions), and simplification (a relative reduction of redundancy 
and irregularity in a linguistic system). Amos (2007: 64), for example, reports 
the gradual leveling away of /t/ flapping in the community of Mersea Island in 
Essex, which is relatively isolated for the South East of England: a decrease from 21 
percent of all tokens among older informants to just 7 percent among younger 
speakers, with a concurrent increase in glottalling (from 46 percent to 
70 percent, though note that even among the old, glottalling was already the 
dominant form). Long (2001) provides convincing evidence of contact-induced 
reallocation in varieties of Japanese spoken in the suburbs between the large 
cities of Kyoto and Osaka. Table 10.3 shows that intermediate suburbs such as 
Takatsuki fuse the Kyoto and Osaka systems of the regular and potential negative 
morphemes, selecting the Kyoto regular negative and the Osaka potential 
form, but do so in a way that preserves the distinction between the two negation 
types, which would have been lost had the orientation of the reallocation been 
reversed. 


Table 10.2 Changes in the diminutive of ‘man’ in four urban West Flanders 
dialects of Dutch (Vandekerckhove 2005b: 546) 

               Bruges      Poperinge   Roeselare   Kortrijk 
Older data     maneat.ji   manatfa     manoxi      manja 
Recent data    manatfa     manatfa     manoaka     manoaka 


Table 10.3 The regular and potential negative morpheme systems in Kyoto, 
Osaka, and Takatsuki, a suburb in the transition zone between Kyoto and 
Osaka (Long 2001) 

Dialect      Negative (i.e. ‘do not’)              Potential negative (i.e. ‘cannot’) 
Kyoto        -ahen                                 -ehen 
             (e.g. kakahen ‘I do not write’)       (e.g. kakehen ‘I cannot write’) 
Osaka        -ehen                                 -arehen 
             (e.g. kakehen ‘I do not write’)       (e.g. kakarehen ‘I cannot write’) 
Takatsuki    -ahen                                 -arehen 
             (e.g. kakahen ‘I do not write’)       (e.g. kakarehen ‘I cannot write’) 
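
The logic of the Takatsuki reallocation can be checked mechanically. What follows is a minimal illustrative sketch in Python and is not part of Long's (2001) analysis: only the suffixes come from Table 10.3, while the dictionary layout and the function names `mix` and `keeps_distinction` are assumptions introduced here for illustration.

```python
# Suffixes as given in Table 10.3 (Long 2001). The data layout is an
# illustrative assumption, not Long's own notation.
SYSTEMS = {
    "Kyoto": {"negative": "-ahen", "potential_negative": "-ehen"},
    "Osaka": {"negative": "-ehen", "potential_negative": "-arehen"},
}


def mix(negative_from: str, potential_from: str) -> dict:
    """Build a hybrid system taking each suffix from one input dialect."""
    return {
        "negative": SYSTEMS[negative_from]["negative"],
        "potential_negative": SYSTEMS[potential_from]["potential_negative"],
    }


def keeps_distinction(system: dict) -> bool:
    """True if 'do not' and 'cannot' are marked by different suffixes."""
    return system["negative"] != system["potential_negative"]


# Attested Takatsuki pattern: Kyoto negative + Osaka potential negative.
takatsuki = mix("Kyoto", "Osaka")
# Hypothetical reverse pattern: Osaka negative + Kyoto potential negative.
reverse = mix("Osaka", "Kyoto")

print(takatsuki, keeps_distinction(takatsuki))
print(reverse, keeps_distinction(reverse))
```

Under these assumptions the attested mix yields -ahen versus -arehen, whereas the hypothetical reverse mix yields -ehen for both functions, which is the loss of contrast that the Takatsuki system avoids.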

The infamous case of south-east England’s “Estuary English” — essentially a 
gradually emerging convergent regional dialect — highlights not only some of the 
linguistic consequences of supralocalization, but also that this process is still in 
progress and has advanced at different rates across the region as the mobilities 
which have created the contact, too, have occurred at different and uneven rates 
and intensities. Przedlacka (2002), for example, shows that while four of the 
counties surrounding London appear to be undergoing similar sorts of changes, 
they are doing so to different degrees, at different rates, for different variables. 
In fact half of the variables she examined in young speakers showed statistically 
significant differences between different counties, showing that while convergence 
may be under way, it is by no means complete or evenly distributed. Przedlacka 
claimed, consequently, that “the extent of geographical variation alone allows us 
to conclude that we are dealing with a number of distinct accents, not a single 
and definable variety” (2002: 97). Furthermore, for some variables she found no 
change across real time at all — for example, teenagers in her data had levels of 
glottal stop use for /t/ that matched speakers in the Survey of English Dialects born 
a century before them. 


5 Lack of Contact: Boundaries 


Just as regions are formed through the mediation of the mobile practices of people 
within them and social, economic, and institutional pressures from above, 
boundaries between regions form where the density of contacts across the land- 
scape is much less intense, where interaction has, for some reason — physical, social, 
economic, or institutional — been weaker. Boundaries, then, may well be created, 
shaped, and reworked by the lack of contact and mobility across the divide that 
separates two or more regions. This means, first of all, that, in the right circum- 
stances, a boundary can move or disappear and, secondly, that given a historical 
entrenchment of a boundary in the mobilities of generations of speakers, a 
boundary may persist despite the fact that its probable original cause has long 
disappeared. 

One important boundary at which we can explore the effects of mobility (or 
the lack of it) is that between the Netherlands and Germany, a political bound- 
ary often cited as one crossed by a dialect continuum, where West Germanic dialects 
on either side of the boundary are, or at least were, very similar to each other 
(Chambers and Trudgill 1998: 5; Trudgill 2000: 3). Two studies, both considering 
different parts of the boundary, exemplify how changes in political and institu- 
tional status can dialectologically sharpen once rather permeable boundaries. 

De Vriend et al. (2008) examined the Dutch-German border in the Kleveland 
area south of where the Rhine crosses into the Netherlands. The area covers a 
historical dialect continuum that straddles what only in 1830 became a stable national 
boundary between the two countries. As time passed, of course, people’s every- 
day lives on either side of the boundary gradually became more and more shaped 
by institutional pressures and jurisdictions. Firstly, political and economic problems 
often led to crossing restrictions (Giesbers 2008: 66-7). Dutch 
children (from 1901) went to school in the Netherlands (in which Standard 
Dutch was the prescribed norm) and German children went to German schools 
(in which High German was the norm). The local dialects, therefore, although 
similar at the informal vernacular level, looked to divergent standard forms 
depending on which side of the border their speakers were living. Later, when 
the broadcast media developed, dialect speakers on the Dutch side received TV 
and radio in (standard) Dutch, and on the German side in (standard) German. 
The researchers also showed that spatial behaviors began to change too — over 
time, there were, for example, fewer and fewer cross-border marriages (Giesbers 
2008: 63), fewer people had friends on the other side of the border, fewer went 
shopping in the other country, and so on. Other institutional factors also played 
a role — few local public transportation routes now cross the border (Van Hout, 
p.c.). The researchers examined the extent to which the political boundary was 
also becoming a linguistic boundary as well as the degree to which linguistic 
behavior correlated with perceptions and spatial behavior on either side of the 
political boundary. Results following a battery of data collection and perceptual 
experimental tasks showed that “Location pairs across the Dutch-German border 
are nowadays separated by a clear linguistic gap” (de Vriend et al. 2008: 129), 
and this finding was robustly supported by perceptual evidence: people very 
strongly viewed the political boundary as a dialect boundary as well. And, not 
surprisingly, the linguistic differences also correlated strongly with people’s 
(perceived — actual behavior was not examined) sociogeographical spaces — 
where their friends and family lived and where they tended to go shopping. 
De Vriend et al. argued therefore that “the linguistic distance is not a property 
on its own, but is embedded in the social structure of the research area” (de Vriend 
et al. 2008: 131). 

Further south, Gerritsen (1999) investigated the dialects of three settlements, 
all within 15 kilometers of each other, all of which used to speak the same West 
Germanic dialect, but which are situated this time in three different countries — 
Belgium, the Netherlands, and Germany. Especially since the beginning of the 
twentieth century, these small settlements have of course increasingly come 
under the differing jurisdictions and regulatory frameworks of one of these three 
countries. She looked at lexical (such as the word for ‘onion’ (un in the original 
dialect of the area, ui in Standard Netherlands Dutch, ui or ajuin in Standard Belgian 
Dutch, and Zwiebel in Standard German)), phonological (e.g. the vowel of the word 
for ‘house’ ([u] in the original dialect, [œy] in Standard Dutch, and [au] in 
Standard German)), as well as morphological features (e.g. plural formation on 
nouns). She found that, comparing middle-aged with young women, all were 
moving toward their respective standard forms, especially those in Germany, where 
the standard is more linguistically distinct from the original local dialect than 
Standard Dutch varieties of either Belgium or the Netherlands. She concludes that 
“political factors can have a strong effect on dialect change” (1999: 63). 

Both these examples show that boundaries can emerge as life paths, and the 
perceptual factors which are shaped by and shape those paths, are steered by 
institutional and structural constraints. The boundaries are still physically very 
permeable, but routinized paths tend not to lead people to cross them. As we will 
see, such patterns can emerge at the sub-national level too. 


6 Contact-Induced Divergence 


We saw earlier how contact doesn’t always so straightforwardly lead to the level- 
ing of local dialect features, because it doesn’t take place across geolinguistically 
homogeneous landscapes and because it can lead to the emergence of hybrid forms 
not found in the input varieties. On occasions, though, contact can lead to divergence, 
as communities react against incoming innovations.¹ One example of such 
an “anti-innovation” divergent reaction was highlighted in the Basic Materials of 
the Survey of English Dialects (SED, Orton 1962-71) on the rhotic side of the 
rhotic/nonrhotic dialect boundary along the Welsh—English border. In England, 
unlike in the US, nonrhoticity is advancing and contact in this area with advan- 
cing nonrhoticism has led to the emergence of epenthetic rhotic forms in lexical 
items with no etymological <r>: the word last, for example, in this area, was often 
pronounced as [laɹst]. Trudgill (1986: 75) claims that this epenthetic /r/ is the result 
of a reaction against innovation: “the r-ful pronunciation... becomes a local 
dialect symbol, and the use of that pronunciation a way of indicating dialect and 
local loyalty” (see also Britain 2009; and, for examples from the Iberian peninsula, 
Penny 2000: 53-4). 

A similar pattern of contact-induced divergence can be found in the relatively 
isolated community of Smith Island, in the Chesapeake Bay of Maryland, USA, 
following work by Walt Wolfram, Natalie Schilling-Estes, and Jeff Parrott (see, 
for example, Schilling-Estes & Wolfram 1999; Parrott 2002; Wolfram 2002). The 
islands have long been relatively isolated and ever more speakers have been 
moving away to seek better employment prospects. Unlike on Ocracoke (see, for 
example, Wolfram & Schilling-Estes 1995), another insular community further south, 
Smith Island has not attracted many mainland newcomers, and consequently the 
population has been declining, almost halving over the past 30 years (Parrott 2002: 
3), leading to a concentration of the dialect in the mouths of the few that remain. 
This has led to the use of the island’s distinctive dialect characteristics increasing 
rather than decreasing (see Wolfram 2002: 770): these include a back and raised 
nucleus of /ai/: [a'], a front gliding realization of /au/: [e'], and a tendency to 
level past BE to weren't, regardless of person and number (all three of which are 
shared with Ocracoke), along with the use of “weak expletive it” (e.g. it’s a dance 
tonight). Consequently, Smith Island English is diverging dialectologically from 
the nearby mainland. For all four variables, the researchers show that it is the 
islanders born in the 1950s and 1960s, the first group to experience population 
decline as adults on Smith Island, who are the ones leading the surge in use of 
the divergent forms. This group was also the first to experience regular contact 
with mainlanders as they were the first to attend high school on the mainland 
rather than on the island. Similar to the arguments put forward by Trudgill to 
account for hyperdialectal /r/ in western England, Schilling-Estes and Wolfram 
(1999: 510) argue that such divergence may occur because speakers “seek (consciously 
or unconsciously) to heighten their already increasing dialectal distinctiveness as 
a sort of linguistic ‘self-defence’ against the encroachment of the outside world.” 


7 The Fens: A Case Study in Contact Dialectology 


To conclude, a case study from a rural British community in which contact, of 
different kinds and at different periods in time, has been central to the genesis of 
its present-day dialect. The Fens (see Figure 10.1) are a low-lying area of eastern 
England situated about 85 miles (135 km) directly north of London, and 50 miles 
(80 km) west of Norwich. Compared with the rest of south-east England, it is 
a relatively sparsely populated region, many parts of which have a population 
density less than a fifth of the English average. The area has a rather interesting 
geomorphological history. Before the seventeenth century the area was largely 
boggy marshland with sparsely distributed communities settled on small patches 
of higher ground which were often themselves subjected to regular flooding; because 
of their impenetrability, the area served as a regional and military frontier 
between east and west (Darby 1931). The geographical and political boundaries 
and the perceptions of the Fens and Fenlanders engendered a strongly negative 
reaction to the area. Darby (1931: 61) claims that there arose “a mythical fear of 
a land inhabited by demons and dragons, ogres and werewolves.” The famous 
diarist Samuel Pepys in an entry on September 18, 1663, describes his travels “over 
most sad fenns, all the way observing the sad life which the people of the place 
do live, sometimes rowing from one spot to another and then wading.” The physical 
impenetrability of the Fens to outsiders, the concentration of sociopolitical 
spheres of influence to the east and west, and the almost demonic external perception 
of the area and its inhabitants led to the Fens becoming seen as a major 
boundary between two important and economically powerful regions. As we will 
see, these social, political, and psychological barriers have led to the development 
of linguistic ones which have survived to this day. 

Figure 10.1 The location of the Fens in eastern England 

The mid seventeenth century proved to be a major turning point in the history 
of the Fens when Dutch engineers were commissioned to begin work on Fenland 
drainage. Much of the major work was completed by the late seventeenth century, 
but in some areas drainage and reclamation were not complete until the early part 
of the twentieth century. A previously barely passable marshland evolved into 
fertile arable land. The impact of the reclamation on the Fenland’s demographic 
structure was considerable. Subsequent to drainage, the Fens saw quite rapid 
demographic growth, particularly in those central Fenland areas which had pre- 
viously been less accessible. The influx came from both east (Norfolk) and west 
(Peterborough and Lincolnshire), though the demographic evidence suggests 
that relatively few came from further afield than the surrounding counties (see 
Britain 1997: 19-20 for more detail about demographic growth and settler origins). 
Despite this, the Fens remain an important boundary to east-west communication. 
Politically the area is still very much a peripheral one. It sits at the north-western 
edge of East Anglia, and at the eastern and southern edges of the Midlands and 
the North. Road and rail links crossing the area remain relatively poor, and func- 
tionally, the absence of a large urban center in the Fens means its inhabitants look 
beyond the area to the east (King’s Lynn, Norwich) or west (Peterborough) for 
the provision of major products, services, and leisure facilities. 

Socioeconomic developments of the second half of the twentieth century, 
which had particular momentum in the South-East of England (see Allen et al. 
1998), opened the Fenland up to greater influence from the South. Especially since 
the 1980s, many people have moved out of the core south-eastern region of England 
to the Fens to take advantage of the cheap housing and a pleasant quiet environ- 
ment which nevertheless has fast connections by train south to London. Even highly 
rural and quite isolated areas of the Fens have seen dramatic demographic change 
over the past 20 years. Certain other local demographic factors have also shaped 
the potential linguistic influences on this speech community. Peterborough, on 
the western edge of the Fens, witnessed New Town development, expanding a 
middle-sized town of the 1960s into a city of over 150,000 people by the end of 
the century. It is highly likely that Peterborough acts and will continue to act as 
a significant “staging post” for the spread of linguistic innovations into the Fens, 
given its excellent service infrastructure, its relatively young multiethnic popu- 
lation, and its local reputation as a modern forward-looking and “connected” city, 
and is an especially attractive destination for young Fenlanders in search of bright 
lights and the big city. New Town development was, in the 1970s, supplemented 
by so-called “overspill” expansion - like New Towns in that large new residen- 
tial areas were built for former residents of urban areas, but not as grand in scale 
or provision. The overspill expansion — mostly from London — in the town of King’s 
Lynn at the eastern edge of the Fens consists of a very large housing estate on the 
edge of a medium-sized town. In the same way as Peterborough, the overspill is 
likely to act as a conduit for the spread of urban forms into more rural Fenland 
areas. Since 2004, the Fens have witnessed large and unprecedented levels of 
immigration from Eastern Europe following EU accession. Beider and Matthews 
(2006: 11) show that the numbers of overseas nationals working in the Fenland 
rose by 650 percent between 2002 and 2005. As yet, no research has examined the 
linguistic consequences of this migration to the Fens. 

Mentioned earlier, in the discussion of the Dutch-German political border, was 
the fact that physical, social, and institutional barriers and breaks in routinized 
communication networks can cause dialect boundaries to be formed. The Fens in 
eastern England is the site of a large bundle of isoglosses separating East Anglian 
dialects from East Midland ones as well as “Northern” ones from “Southern” ones 
(see Britain 1991, 2001). This boundary can partly be explained by the fact that 
before Fenland reclamation east-west communication was extremely difficult, in 
addition to the rather negative perceptions that developed about both the place 
and the people. But the physical and perceptual boundary came to be used as an 
important political one too — the boundary that cuts through the Fens between 
Norfolk in the east and Lincolnshire in the west is not just one between two 
counties, but also between two administrative regions (East of England and East 
Midlands) — the Fens lie at the periphery of both, and while the effects of polit- 
ical boundaries between the two are less extreme perhaps than those of national 
boundaries, nevertheless institutional factors shape people’s spatialities in such 
a way as to reinforce the boundary — education, for example, is determined at the 
county level in England, and county boundaries therefore to a large extent determine 
where the vast majority of 5–16-year-olds will spend most of their day. In 
the sparsely populated rural Fens, children living near the county border will travel 
away from that border further toward the centers of their respective counties to 
go to school in a community where the majority of their peers do not live so near 
to the border. These physical and attitudinal and political reinforcements led 
to social routines and life paths respecting this boundary too, and thereby a set 
of self-perpetuating sociogeographical “grooves” developed (Cohen 1989) which 
reinforced the view of the “Fens-as-barrier” — the physical, social, institutional, 
perceptual effects are, therefore, context-creating and context-renewing with 
respect to the boundary. Below is a list of just some of the phonological, 
morphological, and lexical dialect boundaries that straddle the Fens: 


1 the presence or absence of /h/: largely absent to the west, largely present to 
the east; 

2 the realization of /au/: [e:] to the west, [eu] to the east (Britain 1991, 2003); 

3 the realization of vowels in unstressed syllables: past tense -ed forms and -ing 
forms are largely realized with [ɪ] to the west, but [ə] to the east; 

4 the preservation (east) or not (west) of a nose [nuuz]/knows [nauz] distinction; 

5 the realization of (ʌ) in cup, butter, etc.: [ʊ] to the west, [ʌ] to the east (Britain 
2001, 2002a); 

6 the realization of (a:) in castle, last, etc.: [a] to the west, [a:] to the east (Britain 
2001, 2002a; the dialect boundary for this variable can be seen in Figure 10.2); 

7 the presence (east) or absence (west) of do conjunctions, as in Don’t stroke the 
cat do he'll bite you, where, as Trudgill (1995, 1997) explains, the conjunction 
derives from the grammaticization of a shortened form of because if you do; 

8 third person present tense -s absence (east) or presence (west); 

9 [e] forms of /ei/ in words such as take and make, present in west, absent in 
east; 

10 the use of while meaning ‘until’: Don’t come while four o'clock, present in west, 
absent in east. 


Figure 10.2 The transition zone between short and long vowel realizations of /a:/ 
(as in bath and after), showing east/west differentiation (based on Britain 2001) 

[Map labels: Lincolnshire, Norfolk, Suffolk, Cambridgeshire, North Sea; Spalding, King’s Lynn, Downham Market, Peterborough, Chatteris; percentage bands <10%, 25-50%, 50-75%] 

Figure 10.3 The geographical diffusion of /l/ vocalization among older speakers in the 
Fens, showing north/south differentiation in rates of adoption 


Very much more recent innovations from southern England, especially nonsalient, 
supraregional ones, such as the vocalization of /l/, have ignored these local contextual 
conditions and created north/south differences in the Fens (see Figure 10.3), 
rather than east/west ones. These differential geographical patterns can, of course, 
only be fully understood through an appreciation of the contextual historical, social, 
political, attitudinal, economic, and geographical development of the area. 

We can now turn to examining evidence of the Fens as a site of contact. The two 
major (and very different) demographic upheavals that the area has experienced 
over the past 400 years — post-reclamation immigration in the seventeenth— 
nineteenth centuries and the late-twentieth century effects of mobility and New 
Town/overspill development — can be seen to have led to a number of different 
but typologically similar developments. 

Because the Fens-as-barrier, particularly before reclamation, had been so suc- 
cessful in separating East Anglian from East Midland dialects, when the two 
came into contact as a result of migration from either side onto the drained Fens, 
two quite distinctive varieties came to live side by side. The consequences were 
typologically very similar to those found in contexts where contact was caused 
by migrations of much greater distances, such as the European colonizations of 
the Americas and Australasia (e.g. Trudgill 2004), and the movement of inden- 
tured labor to plantations (e.g. Barz & Siegel 1988) (see also Kerswill 2002; this 
volume): leveling, interdialect, and reallocation. A number of these contact effects 
can be seen most vividly in the central Fens, the meeting place of the boundary 
features listed above (see also Britain 2010a). Here we see leveling of the more 
marked or stereotyped forms: 


Absent (or largely absent) from dialects of the central Fens, but typical of dialects 
to the east are: 

1 the nose [nuuz]/knows [nauz] distinction; 

2 third person present tense zero (e.g. he love, she like); 

3 the realization of /au/ as [eu] (Britain 1991, 2003). 


Present in northern and western varieties, but not usual in the central Fens are: 
1 [e] forms of /ei/ in words such as take and make; 
2 the use of while meaning ‘until’: Don’t come while four o'clock. 


In addition, contact has produced interdialect forms, variants that are linguistic 
hybrids of the input forms. So, in the case of /ʌ/, realized as [ʊ] to the north and 
west of the Fens and [ʌ] to the south and east, the central Fens is focusing an 
intermediate [ɤ] variant (Britain 2001, 2010a). And one further result of the 
post-reclamation dialect contact has been the emergence of an allophonic distribution 
of /ai/ similar to that found in Canada and many parts of the northern US. 
Centralized [əi] variants are found before voiceless consonants and open ones 
[ɑi] before voiced consonants, /ə/ and morpheme boundaries. So night time is 
often realized as [nəiʔtɑim]. This distribution, I have argued (Britain 1997), is the 
result of the reallocation of the open nuclei of /ai/ typical in all phonological 
contexts in dialects to the west of the Fens and the central nuclei typical of the 
east to different phonological environments in the central Fens (see also Britain 
& Trudgill 2005). 

The more recent contact that has resulted from late twentieth-century mobility 
has had similar linguistic repercussions. First of all, a number of consonantal 
innovations, apparently driven by diffusion from London and the South-East of 
England, are fairly rapidly affecting Fenland dialects (see, further, Britain 2005). 
These include: 


1 the fronting of /θ/ and noninitial /ð/ to [f] and [v] respectively: thing [fɪŋ]; 
father [fa:və]; 

2 the vocalization of /l/: bottle [bɒʔɤ], bell [bɛɤ], belt [bɛɤʔ]; 

3 the use of labiodental [ʋ] variants of /r/ in prevocalic position: red [ʋɛd], brown 
[bʋaun]. 


A number of vocalic variants are also spreading rapidly to the Fens, variants which, 
like some of their consonantal counterparts, are also serving as more generalized 
supralocal variants of the South-East of England. It is notable here that these same 
innovations are affecting many other varieties of English outside the South-East 
and outside England: 


1 the fronting of /u:/: goose [gy:s]; 
2 the fronting and unrounding of /ʊ/: good [gɪd], books [bɪks]. 


This latter feature is particularly interesting since it highlights the importance of 
treating innovation diffusion as a contact process. The fronting and unrounding 
process has affected the eastern and central Fens quite considerably (Britain 
2005), but the north and west much less so. This is largely due to the fact that the 
north-west has not yet undergone the /ʊ/–/ʌ/ split, and so all /ʌ/ words are 
still realized as [ʊ]. The fronting process of /ʊ/ began in areas where the split 
had taken place and so was affecting a much smaller lexical class of words. It appears 
that not only is this change not affecting the whole much larger /ʊ/ class in the 
north-west of the Fens, it perhaps not surprisingly is barely affecting any of the 
class at all, even those words which retain [ʊ] in the area with the split. Carfoot 
(2004) has confirmed this resistance to fronting in the Deepings, a group of vil- 
lages just to the west of the Fenland area. Innovations, then, come into contact 
with traditional forms as they spread. Sometimes the change is compatible with 
local phonological systems, and is adopted. But sometimes it is not, and this can 
lead to a slowing down or resistance to that innovation. Such contact between 
innovations from outside and local forms can, just like more extreme forms of 
contact, also produce hybrid patterns as a result. Three examples can demonstrate 
this. The first concerns the widespread fronting of /au/, a change commonly found 
across the South-East of England: knows [nexz], nose [netz]. The eastern Fens, however, 
for the most part retain the historical split (mentioned earlier in the context 
of East Anglia, but long merged in most of the rest of the South-East of England) 
between ME /ou/ and /ɔ:/. /au/ fronting has therefore only affected the lexical 
set that was historically part of ME /ou/. Although reflexes of ME /ɔ:/ are 
beginning slowly to merge with those of ME /ou/ in this part of the Fens, many 
speakers retain now ever more distinct lexically determined forms of /au/: 
knows [nez], nose [nxuz]. Another change affecting this area is the monophthon- 
gization of /ai/ (e.g. time [taim] — [ta:m]). We saw earlier, however, that /ai/ 
in the central Fens shows reallocation, with open nuclei only being found before 
voiced consonants, schwa, and morpheme boundaries. Consequently, the arrival 
of this innovation has only affected those environments and not others, with the 
effect that the two allophones are now even more distinct than they once were: 
night time [naiftaim] (Britain 2005). A relevant grammatical example is the pat- 
terning of past BE (see further Britain 2002b). Historical evidence shows that 
traditionally the Fens were rather a mixed area with respect to past BE. Klemola 
(2008) shows that the eastern counties of Norfolk and Suffolk (which govern part 
of the Fens) largely had a system of leveling to was (I was, he was, but also we was 
and they was) and Cambridgeshire and Huntingdonshire (which also govern a large 
part of the Fens) showed leveling to were (they were, we were, but also I were, she 
were). The vernacular, pan-East Anglian pattern that seems to have emerged over 
the past century is a reallocated past BE paradigm taking elements from both of 
these earlier systems: was is now the leveled form in affirmative contexts (I was, 
but also you was, we was, etc.) and weren't the leveled form in the negative con- 
texts (you weren't but also I weren’t, she weren't) (Britain 2002b). Consequently, today, 
leveled affirmative were forms are virtually nonexistent among younger speakers 
in the Fens, and leveled wasn’t forms quite rare. 

As a result of lack of contact, then, at particular periods in history, a number of 
structural dialect boundaries distinguished the south and east of the Fens from 
the north and west. As a result of dialect mixing, again at different times, and 
triggered by different demographic and socioeconomic factors, a range of contact 
forms developed there too. Because the spatialities of the contact were different 
at the different times, the geographical patterning of the contact outcomes differs, 
but typologically they are similar — leveling and hybridity are common con- 
sequences of the contact, both in the seventeenth to nineteenth centuries after 
Fenland drainage, and in the late twentieth century as a result of the encroach- 
ment of south-eastern regional mobilities in this area. Despite the typological 
similarity to the effects of mixing in many other places, the specifics of the contact 
outcomes are locally determined — contact processes take place on a dialectologic- 
ally diverse landscape and consequently, although there is an overall relative shift 
toward regional convergence (younger Fenlanders sound more like people from 
the South-East of England than older Fenlanders do), there are still a good number 
of local features being used, many of which have been created by the contact itself. 
One feature appearing to be particularly resistant to leveling at the present time 
in the Fens is what Wells (1982) calls “generalized Yod Dropping,” the deletion 
of the post-consonantal glide /j/ in words such as music, few, huge, beautiful (e.g. 
few [fu:]; music [mu:zɪk]). 

Recent research has shown that while many East Anglian locations appear to 
be suffering the obsolescence of this feature, it is robustly preserved at levels of 
over 85 percent in the Fens (Amos, Britain, & Spurling 2008). Similarly, recent 
research has also shown just how distinct the present-day London and Fenland 
relative pronoun systems remain (Cheshire, Fox, & Britain 2007). On the south- 
ern and eastern sides of the Fens, ongoing convergence with the South-East is 
likely in the future, as is, probably, convergence with the East Midlands for 
the west of the Fens — there seems little prospect, given the social, economic, 
and political-institutional circumstances at present, of a dialectologically united 
Fenland. As rural Fenland children today go to school with many children from 
Poland, Latvia, and Lithuania, however, other contact effects may well continue 
to be a source of diversity as well as convergence in this small part of England. 


NOTE 


1 Reaction can also occur to varieties within a confined area, e.g. a city, initiating change. 
This has been shown by Hickey for the shifts in advanced Dublin English in the 1990s 
which, he argues, are best understood as a reaction against local features (dissociation) 
in Dublin English; see Hickey (2000) and Hickey (2005). 


11 Contact and New 
Varieties 


PAUL KERSWILL 


1 Introduction: “New” Varieties 


The term “new variety” implies the convergence, by a population of speakers, on 
a set of linguistic norms which are collectively different from previous norms. 
There are two epistemological issues here. The first is defining what is meant by 
a “population.” Trivially, this can refer to some set of people who, in our case, 
stop speaking Variety A and start speaking Variety B. More usefully, the term 
“population” can be applied in the sociolinguistic sense of “speech community” 
— individuals having some affinity with others through sharing linguistic norms, 
both in terms of linguistic structure and in terms of patterns of variation and 
subjective evaluation. This essentially Labovian view (Labov 1989: 2; Patrick 
2002: 584-8) places the focus on collective behavior, and therefore allows for a 
time depth spanning generations — obviously essential if we assume that language 
change entails young speakers innovating, or at least adopting new features, dif- 
ferentiating them from their elders. 

The second epistemological issue is how to set criteria for a “new” variety. In 
a “normal” speech community, subject to no more than medium rates of in- and 
out-migration, language change is gradual and the concept of a “new” variety is 
irrelevant. The formation of a new variety (which may be a language or a dialect) 
involves more than just changes in norms. We need to envisage a prior period 
of relative absence of norms followed by focusing (Le Page & Tabouret-Keller 1985) 
— the reduction in the number of variant forms and the increase in sociolinguis- 
tically predictable variation, that is, the (re-)emergence of norms. Importantly, new 
varieties lack the inherent continuity (looking backward through time) of slowly 
changing speech-community norms (Kerswill 2002: 695-8). To use a medical 
metaphor, a new variety only emerges when a speech community has experienced 
trauma: through the overwhelming influx of newcomers, through the shift of its 
members to another language, or through the transplantation of individuals from 
different speech communities to a new location where they, as (voluntary or invol- 
untary) settlers, have to form a new community. This latter scenario gives rise to 
pidginization or creolization in cases where no language is shared by a large enough 
minority and where an economically or politically powerful language is a remote 
target. It also gives rise to koineization in cases where mutually intelligible vari- 
eties are spoken by the settlers: According to Trudgill (1986: 107), koineization 
starts out with the prior mixing of features from the different varieties, giving 
rise to a high degree of variability. This is followed by the reduction in the number 
of forms available through koineization. Koineization is the leveling of variant 
forms of the same linguistic items (especially phonemes and morphemes), and 
simplification — the reduction of phonological and morphophonemic complexity. 
This usually, but not necessarily, results in new, distinct, partly hybrid, partly inno- 
vative versions of the parent set of varieties, which are by definition more focused 
than the original mix (Le Page & Tabouret-Keller 1985). When focusing has 
occurred, the process is what Trudgill (2004: 89) calls new-dialect formation. In this 
chapter, much of what we will cover is concerned with this process: the formation 
of what Siegel (1985: 364) calls immigrant koines, a term synonymous with 
Trudgill’s “new dialects.” We will, however, extend Trudgill’s term to include 
another of the above “traumatic” scenarios: rapid changes in norms resulting 
from large-scale immigration, where a native-speaking population is numerically 
overwhelmed by a critical mass of incomers, to the extent that a new generation 
matures with a reduced exposure to the indigenous variety. 

New-dialect formation (henceforth NDF) is a subtype of change-by-contact, and 
so we need to delimit it from other subtypes. I have already distinguished it from 
pidginization and creolization. To the extent that language change is externally 
motivated (and I would argue that nearly all of it is, at least in phonology — in 
inflectional morphology, word-formation, and syntax, analogy is additionally 
operative), contact has almost everything to do with it: Innovations, once actuated, 
are spread through contact, typically but not necessarily face to face. A useful posi- 
tion is to restrict NDF to the results of human migration. Thus, besides internally 
motivated changes such as chain shifts, I also exclude changes that are the result 
of geographical diffusion, though relocation diffusion (the transfer of linguistic 
features through speaker migration — Britain 2002: 622) is of course central, because 
in some cases new varieties (immigrant koines) crystallize as a result of it. If it 
can be shown that a change takes place more or less simultaneously across a par- 
ticular region then this, too, is not our primary concern. Often, such pan-regional 
changes are part and parcel of regional dialect leveling (Kerswill 2003), or its near- 
synonym which emphasizes the adoption rather than leveling out of features, 
supralocalization (Milroy 2002): the amount of dialect/accent differentiation across 
a geographical region is reduced over time, giving rise to greater homogeneity. 
(These types of changes are dealt with in Britain, this volume.) Hickey (2003: 235-6; 
2010) describes a further type, leading to homogeneity across regions: supra- 
regionalization, referring to the adoption, feature by feature, of nonregional forms, 
without dialect contact and leveling taking place. Hickey describes this as taking 
place in Ireland, in whose capital, Dublin, a new “fashionable” variety is emerging 
as a reaction to localized Dublin English, incorporating both existing and innovative 
nonregional forms (Hickey 2000; 2005). Supraregionalization seems to be 
a process related to what Sobrero (1996: 106-8), in the Italian context, refers 
to as “koineization” — the adoption of regional varieties intermediate between 
(standard) Italian and the dialect of large urban centers. All of 
these leveling/convergence processes interact with NDF, although whether or not 
any of them is actually involved in NDF has proved controversial, as we shall see. 


1.1 New-dialect formation: a provisional model 


New varieties emerge, initially, through countless individual acts of linguistic 
accommodation (adjustments) performed by speakers of every age interacting with 
others, the adjustments being variously conscious (strategic, in a manner modeled 
by Communication Accommodation Theory — Giles 1973) and unconscious 
(through subconscious alignment — Pickering and Garrod 2004, cited in Tuten 2008). 
Accommodation may be responsive to context, or it may be long-term (Trudgill 
1986), leading to what Labov (2001: 415-17) calls “vernacular reorganization.” 
Accommodation takes place in a wider, but community-specific social context. 
Crucial community variables include: 


1 the proportions of children to adults in the initial stage (Kerswill & Williams 
2000: 90) 

2 degree of contact within and between age cohorts, especially in families 

3 relations between salient social groups (to the extent that these exist in a new 
community) 

4 the degree to which social group boundaries are, or become, sociolinguistic- 
ally marked 

5 wider linguistic ideologies (Milroy 1999) 

6 personal and social identity formation. 


Against this social backdrop, much of the outline of the new variety is determined 
by the relative frequency of linguistic forms heard among the population in 
the early stages, some of those forms being the outcome of accommodation, 
others belonging to a speaker’s quasi-permanent vernacular. This determinism is 
modified by the social and demographic factors just mentioned, some early in the 
koineization process, some later. In particular, if in the early stages of contact social 
divisions are relatively absent (as we shall see in the discussion in this chapter), 
then accommodation, leading to convergence, will prevail, especially among 
the crucial first child generation (cf. Kerswill & Williams 2000, and below). 
When social antagonisms emerge, whether they are new or reinforcements of 
divisions inherited from the “mother” country, then dissociation — the opposite 
of accommodation — may additionally occur (cf. Hickey 2005). 


2 Tabulae Rasae: South African Bhojpuri and 
New Zealand English 


In an imaginary experimental scenario, one might wish to observe the formation 
of a new dialect ab initio by depositing a population of people, carefully chosen 
to be speakers of different dialects and screened to avoid reflecting existing 
sociolinguistic divisions, on a desert island, and then return after a generation or two 
to see the outcome. This scenario is, of course, impossible on ethical and prac- 
tical grounds. There are, however, a number of locations around the world where 
new dialects, isolated from their ancestral homelands, have been investigated in 
what Trudgill (2004) calls tabula rasa conditions, where speakers of the language 
concerned had not previously lived. Among other similar cases are German 
language islands in Eastern Europe and the USA (Rosenberg 2005) and Bhojpuri 
(Hindi) dialects in Fiji (Siegel 1987; 1997) and South Africa (Mesthrie 1991) (and 
elsewhere — see Trudgill 1986: 99-102). Historical work in this tradition is repre- 
sented by Tuten (2003), Lodge (2004) and Dollinger (2008). Bhojpuri speakers from 
India arrived in South Africa (KwaZulu-Natal) as indentured laborers from 
around the middle of the nineteenth century to around 1910 (Mesthrie 1991: 72), 
and many remained, forming distinct communities. South African Bhojpuri is a 
highly mixed variety, as Mesthrie points out: “SB [South African Bhojpuri] does 
not accord with any single language or dialect of North India, displaying — rather 
— a blend of features from several sources” (1991: 104). Mixing such as this is 
a further characteristic of koines, which we will return to later. The Bhojpuri 
studies in particular show that, in relation to many of the input dialects, there 
is simplification in morphology and morphophonemics, with a reduction in 
the number of morphological categories which are marked, as well as simpler 
paradigms. Thus, South African Bhojpuri shows no trace of the present negative 
copula (Mesthrie 1991: 98), and lacks “respectful” forms in pronouns and verbs 
(1991: 97, 100). However, these two features show a pattern which can be directly 
linked to the demographic and social make-up of the Bhojpuri communities. The 
present negative copula is a minority feature that would have been brought by 
Indian speakers of Eastern Bhojpuri, its absence being a consequence of the fact 
that these speakers were in a minority (1991: 104). Here, demography clearly played 
a role in a way which, as we shall see later, is of crucial importance. Secondly, 
“[I]t seems that this feature [respect marking] did not survive in the koine 
formation process in Natal, no doubt because of the levelling of social distinc- 
tions among [South African Bhojpuri] speakers” (1991: 100). Again, this is a 
critical observation: we will argue that the social upheaval caused by migration, 
and the need to establish a living in a new, often hostile environment, causes 
old social distinctions to be lost, and with them the linguistic marking of those 
distinctions. 

Finally, we look briefly at the demography of the early settlers in South Africa, 
to see if we can find an explanation for this simplification. Mesthrie (1991: 6) 
cites statistics indicating an overwhelmingly adult, predominantly male stream 
of migrants. Trudgill (forthcoming) suggests that “in fact simplification is most 
likely to occur in situations involving language learning by adults, who are 
typically poor second-language learners as compared to small children, particu- 
larly so far as informal acquisition in short-term contact situations is concerned” 
(see also Trudgill, this volume). The theme of child learners versus adult learners 
is one we will return to. 




The Bhojpuri studies do not give us clear information on the original settlers 
or their immediate descendants (the first generation of children born in the new 
location). A study which comes close to this is the investigation of the early stages 
of New Zealand English from the 1840s on by Elizabeth Gordon, Peter Trudgill, 
and their associates (Trudgill et al. 2000; Gordon et al. 2004; Trudgill 2004). The 
uniqueness of the Gordon-Trudgill study is that it was based on oral history 
recordings of elderly New Zealanders made in 1946-8. Crucially, those recorded 
included people born in the period when New Zealand English was, according 
to Trudgill (2004: xi), being formed: 1850-90. The recordings thus afford a win- 
dow on the speech of the earliest children growing up in colonial New Zealand, 
albeit sampled in their late adulthood. Gordon et al. (2004) present a comprehensive 
report and discussion of the findings. Trudgill (2004), however, presents the argu- 
ment that new-dialect formation is almost entirely deterministic in the sense that 
the shape of a new dialect can be predicted with some precision from a know- 
ledge of the input varieties — the dialects of the original immigrants. This theory 
relies on careful consideration of three stages of new-dialect formation, and argues 
that the types of accommodatory behavior and selection of linguistic variants at 
each stage proceed in a predictable way which is largely unconnected with social 
factors — other than demography. Before critiquing his position, I will outline 
Trudgill’s stages (see Table 11.1). These correspond roughly to the first three 
generations of speakers (Trudgill et al. 2000). At Stage I, adult migrants will level 
away features which are in a small minority in the mix of dialects they encounter, 
subject to the individual's ability to do so (adults are generally less successful 
than children in modifying their language, especially phonology; Trudgill 2004: 
89-93). In the New Zealand case, these would have included traditional dialect 
features. There would have been a great deal of inter-individual variability at 
this stage, and people would themselves have been inconsistent in their usage 
(intra-individual variability). 

Table 11.1 Trudgill’s three stages of new-dialect formation 

Stage  Speakers involved                                Linguistic characteristics 
I      Adult migrants (first generation)                Rudimentary leveling 
II     First native-born speakers (second generation)   Extreme variability and further leveling 
III    Subsequent generations                           Focusing, leveling, and reallocation 

At Stage II, the demographic distribution of features begins to determine the 
shape of the focused variety which is still to appear. The Mobile Unit oral history 
recordings represent this stage. According to Trudgill, the absence of a stable adult 
norm, or even a peer-group dialect, means that children pick features to some 
extent “at will,” “from a kind of supermarket” (2004: 103, 108). The reason for 
this is that, in this tabula rasa context, they are not influenced by prestige or 
identity-marking functions (pp. 151-7). Trudgill states that these young speakers 
did not “indulge in long-term accommodation to one another” (p. 108), but 
rather that they selected features based on their frequency, always allowing for 
a “threshold rider” which filters out relatively infrequent features (p. 110). As 
at Stage I, speakers show considerable inter- and intra-individual variability, 
much more so than in a community where the transmission of norms has been 
“normal” in the sense of cross-generational transmission, defined by Thomason 
and Kaufman (1988: 9-10) as taking place when “a language is passed on from 
parent generation to child generation and/or via peer group from immediately 
older to immediately younger.” Examples of this variability in two small com- 
munities are as follows (from Kerswill & Trudgill 2005: 209-10): 


(1) Inter-individual variability in Arrowtown: 
GOAT: [o'] [ou] [ou] [5u] [eu] [eu] 


(2) Intra-individual variability in the speech of Mr Riddle, Palmerston: 
/eɪ/ and /əʊ/ as in FACE and GOAT alternate between Scottish-sounding 
monophthongal pronunciations with [e] and [o] and very un-Scottish 
pronunciations with the wide diphthongs [æɪ] and [eu]. 


Stage III represents the focusing of the new variety, with alternate realizations 
leveled out, leaving only one, or two in the case of reallocation where variants are 
“reallocated” to a linguistic or sociolinguistic function (Britain & Trudgill 1999: 
245). In New Zealand, this resulted in a very homogeneous variety, apparently 
by 1900 (with the exception of the preservation of rhoticity in the far south), though 
with considerable social variation in terms of accent. Trudgill (2004: 115-28) argues 
strongly for a purely quantitative explanation of the features adopted by the 
new variety, citing several vowels and consonants including the retention of /h/, 
the maintenance of the /ʍ/–/w/ distinction (as in which/witch), the merger of 
unstressed /ə/ and /ɪ/ on /ə/ (as in rabbit), and broad diphthongs in words of the 
GOAT, FACE, MOUTH, and PRICE sets. In all but one of these, Trudgill adduces 
a simple majority principle: In the Mobile Unit recordings, the presence of these 
features is more common than their absence. In the case of the merger of /ə/ and 
/ɪ/, /ə/ was not in a majority, with only 32 percent of tokens having this vowel. 
Here, in a rather post hoc manner (and he admits his argument is not strong on 
this point; Trudgill 2004: 120), he appeals to a markedness principle: /ə/ is less 
“marked” in this position than /ɪ/, and this helped guarantee its survival. 
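
The purely quantitative selection described in this paragraph can be made concrete with a minimal sketch. It is an illustration only: the function name, the threshold value of 0.10, and the token counts for H Retention are hypothetical and are not taken from Trudgill (2004) or from the Mobile Unit recordings. For the weak vowel merger, the 32 percent figure is the one cited above, while treating the remaining 68 percent as a single competing variant is an assumption made for the sake of the example.

```python
def select_variant(token_counts, threshold=0.10):
    """Return the variant predicted to survive, using frequency alone.

    token_counts: dict mapping a variant label to its token count.
    threshold: hypothetical share below which a variant is discarded
               (a stand-in for the "threshold rider").
    """
    total = sum(token_counts.values())
    if total <= 0:
        raise ValueError("at least one token is required")

    # Relative frequency of each variant in the input mix.
    shares = {variant: count / total for variant, count in token_counts.items()}

    # Threshold rider: filter out relatively infrequent variants.
    survivors = {v: s for v, s in shares.items() if s >= threshold}
    if not survivors:
        return None

    # Majority principle: the most frequent remaining variant wins.
    return max(survivors, key=survivors.get)


# Invented counts for a variable where retention is in the majority.
print(select_variant({"h-retained": 75, "h-dropped": 25}))

# Weak vowel merger: 32 percent schwa is the figure cited in the text;
# the 68 percent for a single competing variant is an assumption.
print(select_variant({"schwa": 32, "kit-vowel": 68}))
```

On these assumptions the second call selects the non-schwa variant, whereas the attested outcome was schwa. A frequency-only procedure of this kind therefore cannot derive the weak vowel merger, which is why a further principle such as markedness has to be invoked at that point.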

An important component of Stage III is the notion of “drift,” to explain appar- 
ently parallel developments in early New Zealand English and English in 
England. Drift, according to Trudgill, refers either to the continuation of a change 
in the “home” country or a tendency or propensity for a change (Trudgill 2004: 
132-3). Hickey (2003: 229-35) agrees with the quantitative facts, but argues that 
the idea of “inheriting” a change is an unwarranted reification of language. He 
prefers to see changes in terms of individual speakers, especially children, detect- 
ing what is innovative and what conservative, or reinterpreting small phonetic 
changes as a shift of phonemic status. The latter would be the case in the parallel 
shift in both England and New Zealand from lax /ɪ/ to tense /iː/ in words like 
coffee (HAPPY-tensing). Hickey reconceptualizes Trudgill’s explanation, but he does 
not detract from Trudgill’s essential points about the dynamic relationship 
between the emergent dialect and its British antecedents. 

There remains the problem of how to explain the almost complete lack of regional 
variation in New Zealand English. Trudgill provides a partial answer. In account- 
ing for the similarities between Australian and New Zealand English, he quotes 
a mixing-bowl metaphor from Bernard (1981): “[T]he ingredients of the mixing bowl 
were very much the same, and at different times and different places the same 
process was carried out and the same end point achieved” (Trudgill 2004: 161). 
Trudgill points out that the dialect mixes in the different settlements in New Zealand 
were not in fact exactly the same, so other factors must be brought to bear. New 
Zealand was a mobile society, despite the considerable distances, and Trudgill 
agrees with Britain, who writes, “settlement isolation, mobility, transience and indi- 
vidualism led to the emergence of an atomistic society freeing people both from 
subservience and from the need to conform that tight-knit local communities often 
engender” (2005: 164-5, referring to Fairburn 1982). Mobility in such a society, 
with a lack of local speech norms, would, it is argued, quickly lead to uniformity. 
At the same time, social prestige played no part, as Trudgill argues for both Stage 
II and Stage III, because children would not have been exposed to a standard 
ideology in New Zealand at that time — and in any case, children align themselves 
linguistically with local speech, especially that of their peers. Britain (2005: 165-6) 
points out that, in the early decades of European settlement, there was no com- 
pulsory education and literacy was low, so that overt pressures from prestige 
varieties could scarcely have had any effect. Both literacy and English-style social 
stratification came a little later, and in any case after the earliest settlers’ children 
had begun the process of koineization. Moreover, New Zealand was engaged in 
what Belich (1996: 330) calls “custom shedding ... Highly overt class differences 
... were leading candidates for the discard pile” — echoing the Bhojpuri communities’ 
leveling of social distinctions (though in their case they were in a subordinate social 
position ab initio). Together, social structure, lack of overt norms, the seemingly 
random choice of forms at Stage II, and a choice of forms at Stage III based solely 
on relative frequencies of forms used by the Stage II speakers all contributed to 
a rapid focusing on a set of features with origins in various British regional vari- 
eties (Trudgill 2004: 114-15). Even so, the homogeneity is much greater than would 
be expected, and we need to account for this. First, we will take a closer look at 
the “social factors” rejected by Trudgill. 


3 Social Factors versus Determinism in Tabula 
Rasa New-Dialect Formation 


We can now relate Trudgill’s findings and arguments to the model of new-dialect 
formation I presented at the end of section 1. That model allowed for “social 
factors” to affect the process at different stages. Trudgill states that such factors 
did not affect children’s dialect development at either Stage II or III in tabula rasa 
new-dialect formation. His argument is that the social set-up of early colonial New 
Zealand meant that “prestige” (including covert prestige), “stigma,” “identity,” 
and “ideology” counted for nothing in what motivated children to adopt the 
features they did. Instead, he appeals to the notion of “behavioral co-ordination” 
(citing Cappella 1997; see also Pickering & Garrod 2004, cited by Tuten 2008). This 
motivates people to converge behaviorally as the default, leading to the quanti- 
tative results Trudgill claims to find. At face value all this is easy to accept — but 
only if we accept all of Trudgill’s assumptions about the nature of early anglo- 
phone New Zealand. The problem lies with the conceptualization of the tabula 
rasa. The Stage II scenario as outlined by Trudgill was most likely fairly rare. Clearly 
it existed in the very first English-speaking settlements where children were present 
(and not in the earlier, male-only, transient whaling communities). After this, it 
would have existed only in relatively small settlements populated by immigrants 
arriving at roughly the same time, with relatively few newcomers arriving 
later. I say “relatively,” because in an isolated farmstead there would be much 
less variability, and because a later, massive influx could swamp the existing, small 
founder population. Conditions for Trudgill’s Stage II would not have existed in 
the larger settlements, except of course on their founding, and a complex social 
make-up, including institutionalized ethnic divides (much of it aimed at Irish 
immigrants, who needed a permit of stay; Hickey, p.c.), quite quickly emerged 
in some of the new urban communities. According to Belich (1996: 405), in 
Christchurch social stratification, some of it imported, apparently deliberately, 
from Britain, was present very early. Christchurch was founded as a Church of 
England settlement, and its ethos is shown by early complaints about Australian 
and “half-breed” incomers (Belich 1996: 339; see also Kerswill 2007). Meyerhoff 
(2006) argues strongly for a more complex and nuanced view of the early stage. 
She points out that several of the variables Trudgill investigates in fact show 
significant effects of parents’ origins (English, Irish, Scottish or Australian) for these 
same Stage II speakers — the model predicts the absence of such effects. It is 
likely that in a diffuse speech community children will be more, rather than less, 
influenced by parental varieties, as has been observed in two studies of new-dialect 
formation in Norway (Kerswill & Williams 2000: 75). Meyerhoff also points out 
that children were born to large families with a great age range among the 
children, and that parents and older siblings would, like everywhere else, use a 
range of contextual styles which would be detected by children. In such com- 
munities, children at both Stages II and III would have coexisted, even within the 
same family, a fact which strongly curtails the time span of the social-factor-free 
Stage II. 

Where does this leave the quantitative, deterministic account? Hickey (2003) 
argues that, because of the high proportion of single men and women, the impact 
of Irish settlers would have been less than their relatively modest numbers pre- 
dict; this seems to me not to be particularly relevant, simply because Trudgill’s 
model excludes minority influences and the Irish were in a minority anyway. Almost 
all of Trudgill’s features support his model, or at least do not contradict it. The 
outcome, according to Trudgill (2004: 157), is a composite in which Stage III 
children selected “upper-class H Retention, lower-class Diphthong Shift, urban 
nonrhoticity, and the Rural Weak Vowel merger, from all the features available 
to them.” This, in his view, guarantees that they were not motivated by any pres- 
tige or identity-based factors; elsewhere, he argues: “But this kind of baggage is 
not relevant to 7-year-old children in the colonial situation — which is precisely 
why the mixing of variants from different dialects of different regional origin and 
different degrees of social status, with a mass of different associations and con- 
notations, always takes place” (Trudgill 2008: 279). As I have indicated earlier, 
I agree with Trudgill in that it is difficult to argue that normative pressures from 
the English-based education system and British ideas of acceptable public usage 
could have had more than a limited effect on this particular outcome (Trudgill 
excludes this possibility entirely). However, as the discussion in the previous para- 
graph suggests, children did not grow up in a social vacuum, devoid of adult 
intervention and isolated from adult norms. Here, one would have to envisage an 
even more extreme experiment: the depositing of a population of pre-adolescents 
on a desert island, Lord of the Flies-like. 

A tabula rasa quickly ceases to be a clean slate (if it ever was, since adults in a 
new community are not entirely deracinated). It is wrong to say that social factors 
had no effect in the early period. The question is, which social factors? Prestige, 
and probably identity formation, can certainly be ruled out at the very start of the 
koineization process, but parents’ inherited ideas about good and bad behavior, 
and acceptable linguistic practices including politeness, would have been brought 
to bear. The majority of New Zealand children at Stage II and certainly Stage III 
would not have been growing up in the idealized situation Trudgill envisages; 
indeed, the Mobile Unit recordings are heavily biased toward the rural and so 
are not representative of how the early settlers lived, even if taken as a whole 
the balance across the regions of the British Isles did reflect the total immigrant 
population of the country. Our conclusion must be, however, that Trudgill’s 
three-stage deterministic model is correct, but only in the rarefied, hypothetical 
situation such as the imaginary experiment mentioned earlier. In the real case 
Trudgill presents to us, the outcome is, indeed, pretty much as predicted. How- 
ever, the model Trudgill gives us is quickly “messed up,” as I have argued. Trudgill’s 
idealized route, with its absence of identity factors (note my restriction to this 
type of social factor), doubtless pertained in a few places, with the linguistic 
outcome closely corresponding to focused New Zealand English. But it existed 
alongside a vast majority of socially complex, more “normal” situations. Thus, 
while accepting the numerical model, it is difficult to accept its detailed imple- 
mentation “on the ground.” 


4 Homogenization in New Varieties 


All this said, there is still a serious gap in our understanding. The geographical 
homogeneity of New Zealand English is not merely a relative matter (compared 
to, say, Great Britain), but (barring Southland rhoticity) virtually total. Phonetic 
divergence has only recently emerged (by all accounts) in the form of distinct Maori 
and Pasifika Englishes (Hay, Maclagan, & Gordon 2008: 105-9). The mixing-bowl 
metaphor does not in itself predict total homogeneity, but differences resulting 
from any variations in the ingredients. Thus, in a highly mobile and atomistic 
society, we might expect regional varieties to have emerged around Auckland and 
the other main centers, in the manner of the regional dialect leveling in southern 
England today (Williams & Kerswill 1999; Kerswill 2003). However, this is not 
the case. Hickey proposes a mechanism of supraregionalization to account for this: 
“[D]ialect speakers progressively adopt more and more features of a non- 
regional variety which they are in contact with. There does not have to be direct 
speaker contact . . .” (2003: 236). In Trudgill’s Stage III, the new variety can “be seen 
as a product of unconscious choices made across a broad front in a new society 
to create a distinct linguistic identity” (Hickey 2003: 215). This supraregional 
variety, according to Hickey, would have emerged in the “melting-pot” settle- 
ments, which had mixed populations of relatively high density and size but with 
similar mixes, and then spread to the much more dialectally distinctive rural 
settlements. This process can be observed in contemporary Europe, notably in 
Denmark, where regional dialects have all but given way to a non-conventionally 
prestigious but “modern” Copenhagen-based variety (Pedersen 2005; Kristiansen 
and Jørgensen 2005). But for New Zealand English, in spite of high mobility, it is 
extremely difficult to see how sufficient familiarity with the new variety could have 
come about to act as a reliable model. We cannot be certain how or exactly when 
the entirely nonregional variety appeared, but it is necessarily distinct from the 
focusing at local or regional speech community level which presumably preceded 
it. We should therefore introduce a Stage IV, at which new-dialect formation is 
already complete at the local/regional level, and at which supraregionalization 
is about to set in. 

Demographic and other social factors — especially gender and class — mentioned 
by Gordon (2008) guided the spread of New Zealand English, either promoting 
it or hindering it. Crucially, they did not actually determine its form. Instead, we 
have an image of the fully formed variety spreading inexorably, meeting varying 
degrees of resistance. Interestingly, Arrowtown (a focus of the Gordon-Trudgill 
studies) is one of the pockets where Trudgill’s model seems to have applied; but 
in the context of New Zealand it was too small to have been a center of influence 
in itself. What Gordon describes matches rather closely what Hickey means by 
supraregionalization. We have no access to its mechanism in those early days; 
fortunately, we have the modern sociolinguistic record of countries like Denmark 
and (to a more limited extent) Ireland, where the same process has been documented 
in recent times. 


5 Koineization in New Towns 


The second type of new-dialect formation we will discuss is that of the new town, 
which we take as representative of new settlements where there are prior speakers 
of the language concerned, and where, therefore, some degree of face-to-face contact 
with existing speakers, either in the new location or the “home” location, is 
maintained. The term “new town” is a designation in official UK usage for a new, 
planned urban settlement placed on previously more or less unoccupied land, 
built throughout the twentieth century but particularly since World War II. The 
earliest new towns described in the sociolinguistic literature are, however, in 
Norway, where several were established at the heads of fjords in the period 1910-20 
to harness hydroelectric power for the smelting of various ores. These included 
Odda and Tyssedal (east of Bergen), Sauda (south-east of Bergen), as well as 
Høyanger (in the Sognefjord, north of Bergen). The first two of these were investigated 
by Sandve (1976), and an extensive report and interpretation of his 
findings appear in Kerswill (2002: 674-7), with a focus on dialect mixing, inter- 
dialect (intermediate, compromise) forms, simplification, and the very marked effect 
of differences in the geographical origins of the in-migrants to the two towns, which 
are located only 5 kilometers apart. Høyanger was discussed extensively by 
Trudgill (1986) as a prototypical case of new-dialect formation. Basing himself on 
a short publication by Omdal (1977), he established the stages which he later applied 
to New Zealand. Omdal noted a transition from the traditional rural dialect of 
the oldest speakers, who grew up before the establishment of the town, through 
the extreme linguistic heterogeneity of the children who grew up in the new town, 
to a more stable new dialect spoken by the third generation — who were young 
adults in the 1970s. So far, the scenario is similar to New Zealand, with the import- 
ant exception that some of the children growing up were the offspring of the 
original population. These people’s speech was mixed, but was closer to the old 
dialect than was that of the descendants of incomers. In 2001, Randi Solheim 
conducted sociolinguistic interviews in Høyanger, as well as obtaining archive 
recordings and previous descriptions, giving her a real-time window of 45 years 
(Solheim 2006; 2009). 

As expected, the new dialect includes features from both major input varieties, 
West Norwegian (the dialect area in which the town is located, and from where 
the vast majority of in-migrants arrived) and East Norwegian (including the 
capital, Oslo, from where many of the managers and engineers came). However, 
despite the relatively small number of people from the east, a large number of 
high-frequency words have an eastern form. As in most koines, there are interdialect 
forms (Trudgill 1986: 62-5). In Høyanger, these are a compromise between 
the two main dialect sources, often involving the blending of a western stem with 
an eastern inflection — or vice versa. The eastern forms which also represent 
simplifications (particularly loss of velar—palatal alternation and umlaut in verbs) 
are now widespread in western Norway, as part of regional dialect leveling (Kerswill 
2002: 684) — a point I will return to in a discussion of the new dialect of Milton 
Keynes. However, most of the eastern forms, including the pronominal forms dere, 
de (/diː/) and noen, are rare elsewhere in the region, thus demonstrating that the 
dialect is a koine. 

Table 11.2 Evolution of two salient Høyanger variables 

East Norwegian   West Norwegian   Generation I      Generation II    Generation III   Gloss 
and bokmål       and nynorsk      (rural dialect) 
ikke /ɪkə/       ikkje /ɪçə/      ikkje             ikke and ikkje   ikkje            negator 
jeg /jæɪ/        eg /eːg/         eg                jeg and eg       eg               ‘I’ 

Note: Usage in Generations II and III was variable, as expected in the early stage of a 
mixed dialect. By Generation III, the eastern forms had all but disappeared. 

I mentioned earlier that the Høyanger study gives us a direct window on social 
influences on the outcome of new-dialect formation. We have already seen that 
the number of high-frequency eastern forms is disproportionate to the number 
of in-migrants from that region: Solheim attributes this to the high social pres- 
tige of those people. However, a number of items initially took an eastern form, 
only to be replaced by the original western realization. Table 11.2 shows two such 
instances, where there has been a shift from the original dialect forms, through 
East Norwegian forms, back to the original variants. Solheim uses a terminology 
similar to that of Trudgill, labeling the stages “generations” in order to link 
individual life experiences more directly with developments in the town itself. 
“Generation I,” however, refers not to the in-migrants, but to the existing dialect- 
speaking population. 

Solheim interprets this shift in terms of the notion that the East Norwegian forms, being 
associated both with the former managerial class and Standard Norwegian in 
its prestigious bokmål instantiation, are too strong markers to be acceptable in a 
West Norwegian dialect. This point of view is indeed expressed by some of the 
informants themselves. It is worth noting that these two items, the negator and 
the form for ‘I’, are regularly cited by Norwegians as regional and social dialect 
markers, doubtless partly because they are also associated with the two versions 
of Standard Norwegian: nynorsk (mainly western and rural) and bokmål (eastern, 
northern, and urban). This is an extremely clear indication that language ideology 
may be a direct motivation for a wholesale change in linguistic usage. 

There is evidence, too, of local identity formation guiding changes in usage. 
Two diphthongs, /æɪ/ and /øy/, originally had fairly localized realizations as 
[aɪ] and [ɔy], respectively, setting them apart from much of the rest of the country 
which has realizations in the region of [æɪ] and [øy]. The supralocal variants of 
both diphthongs were in a large majority in Generation II, only to be largely replaced 
by the local form in Generation III. Meanwhile, Solheim has real-time evidence 
that Generation III speakers actually increased their use of the supralocal variant 
during their lifetimes, while today’s youngest speakers, Generation IV, seem to 
be increasing their use of the local variant of /øy/. They do this most markedly 
in the place name Høyanger, insisting explicitly that [hɔy'aŋər] is the correct 
pronunciation (Solheim 2008: 6). 
The advantage of the Høyanger studies is that it is possible to show how 
local social factors affected outcomes, thus confirming some of the criticisms of 
Trudgill’s deterministic model, namely that it does not pay attention to the mechanism of 
transmission at Stages II and III. Solheim notes that 


a central observation emerging from my work on more recent data from Generation 
II, not least from my encounters with the speakers themselves, is that individuals’ 
personalities and life worlds are, to a great extent, decisive influences on their 
language use. It is likely that background factors such as these are particularly 
significant for this generation, since, in their formative years, there was no stable, 
local linguistic norm on which they could rely. (Solheim 2006: 237, 
my translation) 


There is every indication that the outcomes in places like Høyanger, Odda, and 
Tyssedal are largely predictable using a quantitative model. Solheim shows that 
there is a range of individual social factors, including social class, personality, and 
even identity formation, which affect outcomes for individuals, perhaps especially 
in Generation II. However, collectively, individual outcomes can only affect the 
direction of focusing if they are affected by the same large-scale social factor. Solheim 
mentions ideology in the shape of attitudes, as well as local identity — both 
operating at Generation/Stage III; social class actually prevented focusing until, 
under conditions of post-World War II social democracy, it became irrelevant. It 
can be argued that all this simply isn’t applicable to tabula rasa situations like 
New Zealand. However, as we have seen, it can equally be argued that ideology 
(but not identity formation) played a part there as well from the very start, even 
within families and small communities — though naturally we have no evidence 
either way except for the speech of Stage II speakers in old age. Language ideo- 
logies rapidly gained a foothold in the main centers, and would have coincided 
even with the earliest Stage II speakers’ childhood and adolescence. That said, in 
the New Zealand case, I agree with Trudgill in rejecting identity, even local, as 
a motivation for the focusing on the mixed norm which took place, if for no other 
reason than that there is no evidence of local dialects ever having formed before 
supraregionalization set in. Perhaps we haven’t been able to observe that stage — 
though Gordon (2008) brings us close. 

Milton Keynes is the newest, and largest, of Britain’s new towns, designated 
in 1967. It lies some 80 kilometers north-west of London, in an already extensively 
leveled dialect area, on the boundary between what Trudgill 
(1990: 63) identifies as the South Midlands and Home Counties (i.e. south-eastern) 
Modern Dialect areas. We have already seen how the Høyanger dialect partly 
adumbrates leveling changes happening in its own still strongly dialectal region. What 
is the situation with a much more highly connected new town in an already leveled 
region? 

Population statistics reveal a rapid increase, particularly from the mid 1970s 
to the late 1980s, as follows (from Milton Keynes Intelligence Observatory, 
www.mkiobservatory.org.uk): 
Year:        1967     1971     1977     1987      1997      2007 
Population:  60,000   66,800   96,300   161,500   196,920   227,796 


Sociolinguistic recordings were made in 1991 and 1992, some 24 years after desig- 
nation and 14 years after large-scale in-migration had got under way. Children 
and young people at that time were representatives of Stage II, their parents Stage 
I, an assumption corroborated by the fact that only 10 of the 48 adults recorded 
were born in Milton Keynes. The primary interest was to observe new-dialect 
formation as it was actually happening, rather than two or more generations later. 
The project could not, of course, observe the fully focused outcome, since Stage 
III had not been reached, though as it turned out focusing was already well under 
way at Stage II.* The sample was composed of eight girls and eight boys in each 
of three age groups of 4, 8, and 12 years old, in addition to the principal care- 
giver (46 mothers, 1 father, 1 aunt). The families were selected to be classifiable 
as “working class.” 
The project was informed by a number of “Principles of koineization”: 


Outcomes in post-contact varieties: 

1 Majority forms found in the mix, rather than minority forms, win out. 

2 Marked regional forms are disfavored. 

3 Phonologically and lexically simple features are more often adopted than 
complex ones. 


The migrants and the first generation of native-born children: 

4 Adults, adolescents, and children influence the outcome of dialect contact 
differently. 

5 The adoption of features by a speaker depends on his or her network 
characteristics. 


The time scale of koineization: 

6 There is no normal historical continuity with the locality, either socially 
or linguistically. Most first- and second-generation speakers are oriented 
toward language varieties that originate elsewhere. 

7 From initial diffusion, focusing takes place over one or two generations. 

8 Because of sociolinguistic maturation, the structure of the new speech com- 
munity is first discernible in the speech of native-born adolescents, not young 
children. 


Not surprisingly on the basis of our previous discussion, Principles 1 and 2 can 
be shown to apply in Milton Keynes. Little can be said about Principle 3, because 
most of the input dialects were already leveled south-eastern ones. These prin- 
ciples are discussed in Kerswill and Williams (2000: 85-9) and will not be treated 
further here. Principle 4 could also not be fully addressed, though it was clear 
that the high proportion of children to adults relative to the general population 
would favor early focusing (26.1% under-16s compared to 20.1% for England and 
Table 11.3 Key to GOAT-fronting 

Degree of fronting       Value      Location/type 

(ou) - 0: [oː], [oʊ]     score: 0   (Northern and Scottish realization) 
(ou) - 1: [əʊ], [əʊ̟]     score: 1   (older Buckinghamshire and London) 
(ou) - 2: [əʏ]           score: 2   (fronting) 
(ou) - 3: [əɪ]           score: 3   (fronting and unrounding) 


Wales). However, Trudgill (2002; this volume; forthcoming) has reasoned that in 
cases where dialect contact and language contact are relatively sustained and involve 
children, complex features may be learned more easily than when contact mainly 
involves adults. 

Principles 5 and 8 can both be demonstrated by an examination of index scores 
on a phonetic variable, the fronting of the vowel of GOAT. Table 11.3 shows the 
values for this variable. 
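
As a rough illustration of how an index score of this kind can be computed, the following Python sketch averages per-token scores on the 0 to 3 scale of Table 11.3. It assumes (the chapter does not spell this out) that a speaker's index is simply the arithmetic mean of the scores assigned to that speaker's GOAT tokens. The function name ou_index and the token lists are hypothetical and do not reproduce data from the study.

```python
from statistics import mean

# Scale from Table 11.3: 0 = Northern and Scottish realization,
# 1 = older Buckinghamshire and London, 2 = fronting,
# 3 = fronting and unrounding.
VALID_SCORES = {0, 1, 2, 3}


def ou_index(token_scores):
    """Return a speaker's (ou) index, assumed here to be the mean token score."""
    if not token_scores:
        raise ValueError("need at least one scored token")
    if any(score not in VALID_SCORES for score in token_scores):
        raise ValueError("scores must be 0, 1, 2, or 3")
    return mean(token_scores)


# Invented token scores for two hypothetical speakers (not study data).
fronting_speaker = [2, 3, 2, 2, 1, 2]
conservative_speaker = [0, 1, 0, 0, 1, 0]

print(round(ou_index(fronting_speaker), 2))
print(round(ou_index(conservative_speaker), 2))
```

On this assumption, an index near 0 indicates mostly back or monophthongal tokens, while an index above 2 indicates that most tokens are fronted.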

Kerswill and Williams (2000: 93-4) show, first, that it is the 12-year-olds who 
have the highest scores, the girls exceeding the boys. Second, the older children 
have greater fronting than their caregivers, the conclusion being that this age group 
is propelling the change. Third, there appeared to be a group of low scorers and 
another of high scorers in this age group. The low scorers (mean score 1.3) are 
two boys and two girls, and appear to be socially quite isolated individuals. The 
high scorers are four girls (mean 2.1), and form a group of friends who are sociable 
and well integrated at school. Given that the fronting of this vowel is an ongoing 
change, the obvious conclusion is that it is female-led. What this means for new- 
dialect formation is that “integrated” children with broad social contacts are in 
a position to focus the new dialect forms; in the sample, all the other children — 
be they 4-year-olds or the less-integrated older children — are linguistically more 
heterogeneous, showing features of their parents’ accents more strongly than 
the integrated children, and are clearly not in the lead in the focusing process. 
Interestingly, this shows that focusing can start at Stage II. It also confirms 
Principle 8 in showing older children to be in the vanguard (see Kerswill & Williams 
2005: 1026-32 for a fuller discussion). 

We turn now to Principle 6. A marker of a new dialect is the absence of any 
stable, locally based dialect to serve as a model for acquisition. This implies that 
there is a break in continuity between generations — not of language acquisition, 
which would lead to pidginization or creolization, but of the transmission of 
local dialect features. This is as true in new towns as in tabula rasa situations. 
The “new dialect” status of Milton Keynes English can be confirmed by examining 
the diphthong /aʊ/ of MOUTH. There are a number of variants of this vowel, 
which appear to be converging on a Received Pronunciation-like [aʊ], moving away 
from local pronunciations such as [ɛɪ] and [ɛʊ]. 
[Plot: (ou) index on the vertical axis, from 0 (least fronted) to 3 (most fronted); subjects on the horizontal axis; separate series for caregivers and children.] 


Figure 11.1 Association of Milton Keynes children’s scores for GOAT fronting with 
those of their caregivers (from Kerswill & Williams 2000: 102) 


The striking point is that there is an abrupt and complete disjunction between 
the variants, first between pre-Stage I (the original inhabitants of the pre-new town 
district) and Stage I (the adult in-migrants), and then again between Stage I 
and Stage II: None of the Stage I or Stage II speakers uses the two favored, 
conservative variants of the pre-Stage I people. A parallel study in Reading, a 
well-established town in the same region, showed a similar trend, but with the 
conservative tokens still in use, albeit at a low level, by the youngest speakers. 
This demonstrates continuity, absent from Milton Keynes. 

Principle 7 can be illustrated, again using the GOAT-fronting data. Figure 11.1 
shows the index score for all the 48 children, ranked from high to low. Against 
each child the caregiver’s score has been plotted. Two patterns stand out. The 
first is that there is no obvious link between the adults and children, with seven 
adults having scores close to zero (representing Scottish and northern English 
pronunciations). Second, the average score for the children is higher than for the 
caregivers, suggesting change (as we have seen). Further to this is the fact that two 
children have a score close to zero. This comes as no surprise when we realize 
that these are 4-year-olds, matching their caregiver. We assume from the data 
for the older children that these children will align themselves with the other 
children as they grow older; for the one child re-recorded 18 months later, this is 
indeed the case, on this feature as on others. The result is greater homogeneity 
among older children, a shift which is necessarily much greater in a new town 
than elsewhere, because almost all of the adults are from elsewhere. This also, of 
course, supports Principle 8. 
6 New Varieties and Migration 


It is not usual to consider changes caused by immigration to an existing speech 
community a case of “new-dialect formation”; however, in the last 20-30 years 
substantial and rapid realignments in the phonologies, and to a much lesser 
extent grammars, of urban dialects have been observed in north-west European 
metropolises. Recent work in Copenhagen (e.g. Quist 2008), Stockholm (Bijvoet 
& Fraurud 2006), and Oslo (e.g. Aarsæther et al. 2008), as well as cities in the 
Netherlands and Germany, suggests such a development in highly multiethnic 
districts. We will take the example of London, a city with a long history of immi- 
gration (and in-migration from other parts of the British Isles). Various incoming 
groups are said to have influenced London English, but it is only with the very 
large-scale immigration post-World War II and especially since the late 1950s that 
we see verifiable change. We will take the example of diphthongs in London English, 
which are usually said to be subject to Diphthong Shift (Wells 1982: 306-10). This 
means that the vowels of FACE, MOUTH, PRICE, and GOAT have long trajectories, 
a development which has been taking place over a century or more and is a 
further development of the Great Vowel Shift. Figure 11.2 shows vowel plots for 
an elderly working-class Londoner (Kerswill, Torgersen, & Fox 2008: 457).* If one 
reads this as if it is a traditional vowel diagram, these characteristics can be clearly 
seen. The lines represent the trajectories of the diphthongs. 


[Vowel plot: F2 on the horizontal axis (2500 to 700) and a vertical axis running from 300 to 800; diphthong trajectories for CHOICE, FACE, GOAT, MOUTH, and PRICE; points for DRESS, TRAP, STRUT, START, and FOOT.] 


Figure 11.2 Vowel plots for an elderly male speaker from Hackney, born 1938 




[Vowel plot: F2 on the horizontal axis (2500 to 500) and a vertical axis running from 200 to 800; diphthong trajectories for FACE, GOAT, MOUTH, and PRICE; points for DRESS, TRAP, STRUT, and FOOT.] 


Figure 11.3 Vowel plots for Brian, aged 17, Afro-Caribbean, Hackney 


Figure 11.3 shows the same vowels for a young inner-city Londoner, who is rep- 
resentative at least of speakers of Afro-Caribbean and West African ethnicities. 
The diphthongs, especially FACE, PRICE, and GOAT, have much shorter trajec- 
tories, and FACE and GOAT are now high peripheral vowels. GOAT in fact shows 
a development which is the opposite of that found in Milton Keynes (and elsewhere 
in the south-east of England). The study showed that, in fact, all ethnicities, includ- 
ing white Anglos, could variably use this system, and that the strongest predictors 
were residence in the inner city (rather than the suburbs) and a friendship network 
which was highly multiethnic. 

Contact, in this case, is in the first instance with the languages of the immigrants. 
However, the changes in London English as a whole are not driven directly by 
second-language speakers, but by the L1 English of their immediate offspring, 
which may well contain copies of L2 features such as near-monophthongal FACE 
and GOAT. It is, in other words, another case of dialect contact. 

The question arises as to why this has happened now, and not at an earlier 
period. The answer seems to be that the rate of immigration in recent years 
has led to inner-city communities where children grow up learning English, but 
with limited access to L1 models. Instead, their models are L2-speaking adults 
and older children who themselves did not have an L1 model. Children from 
English-speaking families (of whatever ethnicity) also grow up in this environ- 
ment, and acquire many of the features. This outline suggests a deterministic model. 
However, there is plenty of evidence that these varieties are strong social and 
identity markers (e.g. Rampton, forthcoming). 


7 Conclusion 


In this chapter, I have attempted to show that new-dialect, or koine, formation is 
a process which can be divided into several stages, at each of which speakers engage 
in certain behaviors and where social factors variably play a part. The overall model 
is broadly, but not completely, deterministic. Even if outcomes are highly pre- 
dictable, many different factors come into play before the outcome is achieved. 
Some of these, in the end, have no bearing on the outcome, while others clearly do. 


NOTES 


1 The term “koine” was first used to denote the form of Greek used as a lingua franca 
during the Hellenistic and Roman periods (see Siegel 1985, who lists a number of 
languages which have been referred to as koines). In this chapter, we restrict ourselves 
to “immigrant koines,” as well as related forms of language arising following what 
I here call “trauma.” 


2 The recordings were made by the Mobile Disc Recording Unit of the National 
Broadcasting Corporation of New Zealand (see Gordon et al. 2004: 3-5 and Trudgill 
2004: x). 

3 New recordings were made by Werdan Kassab in 2009 for his doctoral project. 

4 This work was carried out as part of the ESRC-funded project Linguistic Innovators: The 
English of Adolescents in London, 2004-2007, ref. RES-000-23-0680; investigators Paul Kerswill 
and Jenny Cheshire, research associates Susan Fox and Eivind Torgersen. 
12 Contact and Change: 
Pidgins and Creoles 


JOHN HOLM 


1 Language Contact and Change 


Creolists have long known that contact speeds up language change. In Addison 
Van Name’s 1869-70 article in the Transactions of the American Philological Associ- 
ation, it is clear that he understood that the pidginization/creolization process 
accelerated change: 


The changes which [creoles] have passed through are not essentially different in kind, 
and hardly greater in extent than those, for instance, which separate the French from 
the Latin, but from the greater violence of the forces at work they have been far more 
rapid . . . here two or three generations have sufficed for a complete transformation. 
(Van Name 1869-70: 123) 


In his 1937 doctoral dissertation, Marginal Languages: A Sociological Survey of the 
Creole Languages and Trade Jargons, John Reinecke also stated this understanding 
unambiguously: 


Among the localities most suitable for special studies are those in which the 
marginal languages are spoken. Changes there have been very rapid and pro- 
nounced. Languages can be observed taking form within a man’s lifetime; and 
occasionally the influence of a few individuals may be traced directly, as for ex- 
ample the work of missionary educators in fixing the form of the Lingala lingua 
franca of the Belgian Congo, or of a few British administrators in discouraging the 
Sudan-Arabic jargon. The same influences that are at work upon any language are 
at work upon the marginal languages, but here they can be seen in a more clearcut, 
not to say exaggerated, fashion. (Reinecke 1937: 6-7) 


The study of pidgins and creoles played a central role in the development of contact 
linguistics, long before that name for this area of study began to be used in the 
1980s. Although pidgin and creole studies (often shortened to creole linguistics 
or just creolistics) did not become a subfield of linguistics in academia until the 
second half of the twentieth century, the history of its development (e.g. Holm 
2000: 14-67) provides abundant evidence that scholars have long recognized the 
existence of speech that appears to be a distortion of a known language but is 
not a dialect related to it in the usual genetic sense. 


2 Pidgins 


The earliest known example of a pidgin comes from a book written in 1068 CE, 
when the Arabic-speaking Andalusian geographer al-Bakri cited a traveler who 
had been in Maridi in what is now Mauritania: “The Blacks have mutilated our 
beautiful language and spoiled its eloquence with their twisted tongues”; he then 
provided 10 sentences of their speech (Thomason & Elgibali 1986). 

This reference to pidginized Arabic predates the first known references to the 
pidgin called Lingua Franca, which had probably been in use along the eastern 
coast of the Mediterranean even earlier. The Europeans who came into contact 
with Arabs and Turks during the crusades were known generically as Franks to 
the Levantines, whence the name of this pidgin based on southern Romance 
languages, primarily Italian and Provençal. However, the variety later spoken 
farther west in Algiers had more words derived from the Iberian languages, 
especially after a military defeat of the Portuguese in 1578 led to a massive influx 
of captives (Whinnom 1977: 299). After the French conquered Algeria in the 1830s 
the local Lingua Franca lexicon was drawn increasingly from French until it 
gradually became the nonstandard French of that area (Schuchardt 1909). 

Here is a sentence in Lingua Franca which illustrates some of the basic features 
of pidgins: 


(1) Peregrin taybo cristian,  si querer andar Jordan, pilla per tis  jornis 
    pilgrim  good  Christian  if want   go    Jordan  take  for your journey 
    pan,  que     no  trobar pan   ni  vin. 
    bread [there] not find   bread nor wine 
    (Ensina 1521, quoted by Harvey, Jones, & Whinnom 1967) 


If one compares this sentence to its equivalent in a Romance language like 
Spanish, it is immediately striking that most grammatical inflections are missing 
in Lingua Franca, e.g. the Spanish infinitive form querer is used instead of the 
inflected form quieres ‘[you] want’. Lingua Franca nouns and verbs were usually 
invariable, although speakers of Romance languages sometimes used inflections 
for gender and number from their native languages, and these inflections may 
sometimes have been imitated by speakers of other languages. Usually there 
was no agreement with adjectives, e.g. moro namorada ‘Moorish girlfriend’ (cf. 
Portuguese namorada moira). Only one bound morpheme became stabilized: -ato, 
a past participial ending used to form the past tense as in ti mirato ‘you saw’. 
Like all pidgins, Lingua Franca was a reduced language that resulted from 
extended contact between groups of people with no language in common. It evolved 
because they needed some means of verbal communication, e.g. for trade, but 
no group learned the native language (e.g. Italian, as Italians spoke it among 
themselves) for social reasons that probably included lack of close contact due to 
lack of trust. In such situations the people with less power (speakers of substrate 
languages, by the definition used in creolistics) are more accommodating and do 
the difficult work of learning the other group’s vocabulary. However, those with 
more power (the superstrate speakers) adopt many of the substrate speakers’ 
changes in their language regarding pronunciation, grammar, and the meaning 
of vocabulary, and no longer try to speak it as they would within their own group. 
They cooperate with the other group (or groups) to construct an emergency language 
that will serve their needs, simplifying by dropping unnecessary complications 
such as inflections (e.g. two knives becomes two knife) and reducing the number 
of different words that they use, but compensating by extending their meanings 
or using circumlocutions. By definition, the resulting pidgin is restricted to a limited 
use (e.g. trade) and it is no one’s native language. Furthermore, the languages 
involved must be typologically distant from each other (otherwise a different kind 
of language mixing would result, akin to koineization or dialect leveling) and the 
different language groups must maintain their social distance (or else they or their 
descendants would eventually learn each other’s languages perfectly). 

The sentence above in Lingua Franca from Ensina 1521 is actually a literary 
artefact, as the repeated rhyme suggests. The following is a spontaneous use of 
the pidgin English of Papua New Guinea, called Tok Pisin, recorded by Margaret 
Mead around 1936 and reproduced in Hall (1966: 149): 


(2) naw  mi stap rabawl. mi stap long biglajn.   mi katim kopra. naw  wanfela 
    Then I  stay Rabaul. I  was  in   workgroup. I  cut   copra. Then a 
    mastar    bilong kampani em i-kicim mi. mi kuk  long em  gen. 
    white-man from   company he take    me. I  cook for  him again. 
    mastar king. 
    Mister King. 


One of the most striking features of this text is the absence of complex sentences 
with embedding. However, this recording was made when Tok Pisin was not yet 
widely spoken in an expanded form. Today embedded structures such as rela- 
tive clauses are found in the speech of both native and nonnative speakers of Tok 
Pisin. Here is such a sentence: 


(3) Man ia  we  yu  toktok long en  em  spak. 
    man DEM REL 2SG talk   ADP  3SG 3SG be-drunk 
    ‘The man you were talking to is drunk.’ (Faraclas 2007: 362) 


Note that the structure of the relative clause (we yu toktok long en) is unlike English: even 
though the object of the preposition (or adposition) long is the relativizer we (from 
dialectal English) referring to man, which is set off from the relative clause by 
the demonstrative ia (from here), the object of long is repeated by the third person 
singular personal pronoun en ‘him’, literally ‘Man here, who you were talking to 
him, he’s drunk.’ 
3 Creoles 


Lingua Franca was widely used for many centuries and it probably had an impact 
on the way Europeans came to view communicating with non-Europeans. The 
Portuguese explorations that began in the fifteenth century led to the expansion 
of their commercial networks first to Africa, and then to Asia and the Americas 
— and they were later followed by other Europeans. This led to the emergence 
of the first known creole language, the restructured Portuguese of the Cape 
Verde islands, which were uninhabited until their settlement began in 1462. 
During the following century the Portuguese went on to establish outposts along 
the coasts of Africa, Asia, and Brazil. A language sample collected near the 
Portuguese fort at Mina on Africa’s Guinea coast in the 1550s included Lingua 
Franca words such as molta ‘much’ from Italian, and taybo ‘good’ from Arabic 
(Goodman 1987). 

The earliest known attestation of any creole language is from Martinique, 
dated 1671 (Carden et al. 1991: 5, 7). It includes unequivocal features of modern 
Caribbean Creole French such as the preverbal anterior marker té and the 
postnominal determiner la: 


(4) Moi té  tini peur bete   la  manger monde. 
    1s  ANT have fear animal DET eat    people 
    ‘I was afraid the animal ate people.’ 


The most remarkable thing about this text is that it demonstrates the speed with 
which this new language took form: the French did not settle Martinique until 
1635, so the creole must have emerged during the first 36 years of settlement. 

The earliest known text of an English-based creole dates from 1718; it is in Sranan, 
spoken in Suriname on the northern coast of South America. The passage was 
published in J. D. Herlein’s Beschryvinge van de volks-plantinge Zuriname (repro- 
duced in Rens 1953: 142). The English had settled Suriname in 1651 but traded 
it to the Dutch for New Amsterdam (later New York) in 1667. These sentences, 
written in an orthography resembling that of Dutch, were intended to help new 
settlers from Holland learn the language used there: 


(5) a. Oudy. Howdy. 
b. Oe fasje joe tem? How fashion you stand? 
c. My bon. Me good 
d. Joe bon toe? You good too? 
e. Ay. Aye. 


A present-day descendant of early Sranan is Ndjuka. The ancestors of the modern 
speakers were slaves who escaped from the coastal plantations and established 
their own society in the interior rainforests during the half-century after the above 
text was written. The following Ndjuka text is from Park (1975): 
(6) a. Mi be  go   a onti  anga wan dagu fu mi.   A  be  wan bun 
       I  had gone hunting with a   dog  of mine. He was a   good 
       onti    dagu. 
       hunting dog. 

    b. Da   fa mi waka   so, a  tapu     wan kapasi    na a   olo. 
       Then as I  walked so, he cornered an  armadillo in the hole. 
       A  lon go   so, a  tyai    wan hekon    na   a   olo. 
       He ran away so, he brought a   capybara into the hole. 


Note that unlike the early Tok Pisin text in (2) above (recorded by Mead ca. 1936), 
this creole text has complex sentences, such as the one with the embedded sub- 
ordinate clause beginning fa mi waka so. 

By definition, a creole has a pidgin — or a pre-pidgin jargon without norms — in 
its ancestry. It is spoken natively by an entire speech community, often one whose 
ancestors were displaced geographically so that their ties with their original 
language and sociocultural identity were partly broken. Such social conditions 
were often the result of slavery. For example, from the seventeenth to the 
nineteenth century Africans of diverse ethnolinguistic groups were brought 
by Europeans to their colonies in the New World to work together on sugar 
plantations. For the first generation of slaves in such a setting, the conditions were 
often those that produce a pidgin. Normally the Africans had no language in 
common except what they could learn of the Europeans’ language, and access to 
this was usually very restricted because of the social conditions of slavery. The 
children born in the New World were usually exposed more to this pidgin — and 
found it more useful — than their parents’ native languages. 

Since the pidgin was a foreign language for the parents, they probably spoke 
it less fluently; moreover, they had a more limited vocabulary and were more re- 
stricted in their syntactic alternatives. Furthermore, each speaker’s mother tongue 
influenced his or her use of the pidgin in different ways, so there was probably 
massive linguistic variation while the new speech community was being established. 

Although it appears that the children were given highly variable and possibly 
chaotic and incomplete linguistic input, they were somehow able to organize it 
into the creole that became their first language, an ability that may be an innate 
characteristic of our species. This process of creolization or nativization (in which 
a pidgin acquires native speakers) is still not completely understood, but it is thought 
to be the opposite of pidginization, i.e. a process of expansion rather than reduction 
(although a pidgin can be expanded without being nativized). 

For example, creoles have phonological rules (e.g. assimilation) not found in 
early pidgins. Creole speakers need a vocabulary to cover all aspects of life, not 
just one domain like trade. Where words were missing, they were provided by 
various means, such as innovative combinations. For example, the Jamaican 
Creole word han-migl (from English hand + middle) indicates the palm of the hand. 
Of course, this may also have been a calque or word-for-word translation of an 
expression in an African language. For many linguists, the most fascinating 
aspect of this expansion and elaboration was the reorganization of the grammar, 
ranging from the creation of a coherent verbal system to complex phrase-level 
structures such as embedding. 

Where did the creoles’ grammatical features come from? Various theories have 
been offered to explain the sources of creole language structures; most often these 
point to the creoles’ superstrates and/or substrates, but also various kinds of lan- 
guage universals. Up to now there has been scant empirical evidence to support 
these hypotheses, even though this has been one of the main concerns of creole 
linguistics for over a century. However, there is now a study based on a survey 
of 97 structural features in 18 creole languages spoken all over the world (Holm 
& Patrick 2007). One of these creoles is the restructured Portuguese spoken in Guiné- 
Bissau; a separate study (Holm & Intumbo 2009) compares these 97 features in 
the creole with the corresponding structures in its superstrate (Portuguese) and 
one of its substrate languages (Balanta). This combination of data is not usually 
obtainable, but Intumbo speaks both the creole and Balanta as his first languages. 
The creole features surveyed in Holm and Patrick (2007) in the chapter on the 
closely related creoles of Cape Verde and Guiné-Bissau (Baptista, Mello, & Suzuki 
2007) were compared to the corresponding structures in the superstrate and 
substrate; the percentages of matches were found to fall into the following groups: 


Percentage of creole, superstrate, and substrate features in various groups: 

Group 1: Feature totally absent (-Creole, -Portuguese, -Balanta)                             11.2% 
Group 2: Miscellaneous (e.g. -Creole, +Portuguese, -Balanta)                                  5.1% 
Group 3: Feature found in the creole only (+Creole, -Portuguese, -Balanta)                    9.2% 
Group 4: Feature found in the creole and its superstrate (+Creole, +Portuguese, -Balanta)    11.2% 
Group 5: Feature found in the creole and its substrate (+Creole, -Portuguese, +Balanta)      29.6% 
Group 6: Convergence (+Creole, +Portuguese, +Balanta)                                        32.7% 
Total:                                                                                       99.0% 


The above figures provide the least support for a non-theory: that the relation- 
ship between the creoles and their possible sources is random. Support for the 
influence of universals of adult second language acquisition (e.g. the encoding 
of grammatical information in free rather than bound morphemes) is difficult to 
determine because the same phenomenon is characteristic of many substrate 
languages. Group 1 probably reflects a bias in the original selection of substrate 
features (“Kwa” rather than West Atlantic), and Group 2 seems to be statistically 
unimportant. Evidence of possible creole-internal innovation can be found 
in Group 3, although it makes up less than 10 percent of the features in this survey. 
However, this is not much less than the evidence of possible influence from the 
superstrate only (Group 4). It is the evidence of possible influence from the substrate 
(Group 5) that characterizes nearly 30 percent of the features and makes 
up the largest single category except for convergence (Group 6), which slightly 
exceeds it. But if we consider the last three categories together (superstrate and 
substrate influence combined with convergence), we have a clear answer: they 
account for the overwhelming majority of the features (73.5%). 
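
The arithmetic behind these figures can be checked with a short sketch. The percentages are those printed in the list of groups above; the names GROUPS and combined_share are illustrative only and do not come from the studies cited.

```python
# Percentages as printed in the list of groups above.
# The labels are abbreviated; the dictionary layout is an assumption of this sketch.
GROUPS = {
    1: ("feature totally absent", 11.2),
    2: ("miscellaneous", 5.1),
    3: ("creole only", 9.2),
    4: ("creole and superstrate", 11.2),
    5: ("creole and substrate", 29.6),
    6: ("convergence", 32.7),
}


def combined_share(group_ids):
    """Sum the percentages of the given groups, rounded to one decimal place."""
    return round(sum(GROUPS[g][1] for g in group_ids), 1)


# All six groups together: matches the printed total.
print(combined_share(GROUPS))      # 99.0

# Superstrate only, substrate only, and convergence taken together.
print(combined_share([4, 5, 6]))   # 73.5
```

Summing Groups 4 to 6 reproduces the 73.5 percent cited in the text, and summing all six groups reproduces the printed total of 99.0 percent.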

Finally, some related concepts used in creole studies should be mentioned. In 
some areas where the speakers of a creole remain in contact with its lexical donor 
language (e.g. in Jamaica, where English is the official language) there has been a 
historical tendency for the creole to drop its most noticeable non-European features, 
often (but not always) replacing them with European ones — or what are taken 
to be such. This process of decreolization can result in a continuum of varieties from 
those furthest from the superstrate (the basilect) to those closest (the acrolect), with 
mesolectal or intermediate varieties between them. After a number of generations, 
some varieties lose all but a few vestiges of their creole features (those not found 
in the superstrate) through decreolization, resulting in post-creole varieties. 


4 Partial Restructuring 


African American English and nonstandard Brazilian Portuguese were believed 
until recently to be post-creole varieties, but there is a growing consensus that 
they are actually semi-creoles or partially restructured vernaculars (Holm 2004). 
That is, although they have both creole and noncreole features, this does not 
imply that they themselves were ever basilectal creoles. As a matter of fact, 
in the areas where African American English (AAE) and Brazilian Vernacular 
Portuguese (BVP) emerged, there is a lack of convincing evidence of the existence 
of a widespread creole as distant from its lexical source language as Sranan or Ndjuka 
is from English (see above). The explanation appears to lie in differing demographics, 
which created differing sociolinguistic conditions. If more than a small minority 
(20-25 percent) of a colony’s founding population are native speakers of the 
colonists’ language, something other than full creolization results. Nonnative 
speakers (usually the non-Europeans during the first century) have more access 
to native speakers of the language they need to learn, and so they can learn 
more of its structure, resulting in a new variety with a substantial amount of the 
superstrate’s morphology and syntax intact, including the inflections usually not 
found in basilectal creoles, but also with a significant number of the structural 
features of a creole, such as those inherited from its substrate or the interlanguages 
that led to its preceding pidgin. 

Partial restructuring occurs when people with different first languages shift to 
a typologically distinct target language (which is itself an amalgam of dialects 
in contact and may include fully restructured pidgins and creoles) under social 
circumstances that partially restrict their access to the target language as normally 
used among native speakers. The theoretical model proposed in Holm (2004) is 
that the linguistic processes producing a semi-creole include the following: (1) dialect 
leveling, preserving features that may be archaic or regional or socially restricted 
in the superstrate; (2) language drift, following internal tendencies within the super- 
strate, such as phonotactic, morphological or syntactic simplification; (3) imper- 
fect language shift by the entire population, perpetuating features from ancestral 
languages or learners’ interlanguages in the speech of monolingual descendants 
(cf. Thomason & Kaufman 1988: 38-9, 48); and (4) borrowing features from fully 
pidginized or creolized varieties of the target language spoken by newcomers, or 
found in nearby areas where sociolinguistic conditions were favorable to full restruc- 
turing. There may also be (5) secondary leveling, corresponding to the decreolization 
which full creoles can undergo. 

There are at least 200,000,000 speakers of semi-creole language varieties, which 
also include some lects of nonstandard Caribbean Spanish, both standard and non- 
standard varieties of Afrikaans, and some lects of the vernacular French spoken 
on the island of Réunion in the Indian Ocean. This group would appear to include 
a number of the vernacular varieties of European languages spoken as mother 
tongues in Africa, such as the nonstandard Angolan Portuguese of Luanda 
(Inverno 2009), as well as some varieties of English spoken by American Indians 
and Aboriginal Australians. Of course gradient phenomena defy neat separation 
into distinct categories, but certain traits, such as even the marginal status of 
gender agreement within the Réunionnais noun phrase (Holm 2004: 108-10) 
suggest the inappropriateness of categorizing such varieties as fully creolized 
languages, and there is general agreement that they are also distinct from unre- 
structured overseas varieties (such as Azorean Portuguese or Ontario English). 


5 Creolistics and Contact Linguistics 


Creolistics, including the study of pidgins, creoles, and semi-creoles, has occupied 
a special place within contact linguistics. Although interest in language contact 
goes back a number of centuries (Winford 2003: 6), it was the scientific focus on 
radically restructured languages, which could be said to begin with Schuchardt in 
the nineteenth century, that developed into the core of the field. Weinreich (1953) 
made it clear how broad the scope of the field should be, but much of the rest of 
the twentieth century was spent by creolists trying to keep their studies confined 
to what they could agree were true pidgins and creoles — precisely because there 
was so much disagreement among them when it came to defining the object of 
their study. Since the 1980s, however, the general movement has been to place 
pidgin and creole linguistics within the broader scope of contact linguistics 
(Thomason 1997), including language varieties resulting not only from the kind 
of restructuring associated with pidginization and creolization (to whatever 
degree) but also those resulting from such processes as intertwining (Bakker 
& Muysken 1994), koineization, or indigenization (Siegel 1997). Such studies 
promise to increase our understanding of the range of possible outcomes of 
language contact, and this understanding will surely shed new light on many kinds 
of restructuring. For example, Siegel (1997) examined immigrant koines (e.g. over- 
seas Hindi), indigenized varieties (such as Singapore English), and even renativized 
Hebrew, concluding that the adoption of features in the leveling that took place 
in all of these was affected by certain common factors: frequency, regularity, salience, 
and transparency. This suggests a solution for the long search for principles 
that guide the selection of substrate features into pidgins, creoles, and partially 
restructured varieties during their genesis, and their adoption of other features 
during their later development: the likelihood of a lexical or structural feature 
being selected is greater if it is more frequent, more regular, more salient, or more 
transparent. It is clear that researchers working in different areas of contact 
linguistics need to keep abreast of each other’s work so that the insights gained 
in one area can be tested in other areas of our field. 
13 Scenarios for Language Contact 


PIETER MUYSKEN 


1 Introduction 


Languages do not exist in an ecological vacuum. The lives of the people who speak 
a language influence its very nature and properties in many ways. Often these 
people also speak other languages, and then these languages may exert an 
influence on it as well. This influence we call language contact. 

Language contact has become an important topic in the study of language, 
due to developments both in society and in scholarship. In many countries, there 
has been a tremendous increase in the number of migrants, leading to increas- 
ingly multilingual societies. As techniques for recording, storing, and analyzing 
spoken language data have developed and improved, it became clear that 
multilingualism had an effect on the languages in question, an effect that could 
be studied systematically. Starting with the work of Uriel Weinreich (1953) and 
Einar Haugen (1950), several generations of scholars have contributed to the study 
of language contact, and a growing body of knowledge in this area has been 
gathered, with both detailed case studies and complex analytical frameworks. 
There is little consensus on the precise framework needed, but substantial agree- 
ment on aims, methods, and relevant data. 

While the study of language contact has been concerned particularly with 
contemporary speech communities, it also has had its influence on the study of 
language history, in the field of historical linguistics. There was a time when 
language contact was assumed to have had a limited role in the historical devel- 
opment of a language (cf. Lightfoot 1979, for example). This view has changed 
considerably in recent years, due to a number of findings. The geo-linguistic 
study of areal typology, initiated by Nichols (1992) and with a recent high point 
in the World Atlas of Linguistic Structures (WALS, Haspelmath et al. 2005), has shown 
that the structural features of languages are to an important degree determined 
by the area in which they are spoken (cf. also Haspelmath 2001). Consequently, 
the recent research agenda of typologists is increasingly concerned with areal 
issues. 
The study of language contact settings has revealed that there are a considerable 
number of “mixed languages” of various types, spoken in different parts of 
the globe (cf. Bakker & Mous 1994; Smith 1994; Thomason 1997). 

Dixon’s (1997) punctuated equilibrium model has questioned the widespread 
usefulness of the family tree model. Even though his proposals have met with 
criticism, the view that wave models have an important role to play next to tree 
models has become widespread in historical linguistics. 

Work on reduction processes has shown that diffusion and leveling can have 
important structural consequences (Kusters 2003; Trudgill 1986). In the extreme 
case, many creoles share important structural features (McWhorter 1998), even 
though there is disagreement about the possibility of defining a Creole language 
as a type in typological terms. 

As it is developing into a global discipline and interacting with human sciences 
such as human evolutionary biology, cognitive science, archaeology, anthro- 
pology, and ethno-history, historical linguistics meets a number of challenges, but 
also occupies an increasingly central role. 


1 Now that a number of major language families have been established, deep 
time relationships between possibly distantly related language families are 
explored. 

2 In several parts of the world (New Guinea, the Amazonian fringe) the diversity 
encountered does not seem amenable to direct interpretation in terms of 
the classical family model. 

3 A number of languages are difficult to classify genetically. 


On the whole there is widespread consensus among historical linguists concern- 
ing the primary status of the comparative method in historical-comparative lin- 
guistics, a method which does not take contact-induced change directly into account. 
The comparative method has been applied successfully to a number of language 
families all over the world, and proven its worth time and again: it is possible, 
using careful methods, and rigorously applying the principle of the regularity of 
sound change, to establish links between languages and to outline the contours 
of their linguistic ancestors, through reconstruction. There are also some limita- 
tions to the method, however. 

Harrison (2003: 213) suggests that there are four potential areas that may 
involve limitations to the comparative method. Temporal limitations concern the 
time depth at which the comparative method may be applied. Socio-historical 
limitations concern external factors that make application of the comparative 
method impossible or difficult. Limitations as to the linguistic domain concern 
the specific components of language, namely those aspects involved in the sound 
structure of language, that the method has been operating with, along with its 
lexicon and morphology. There are limitations on clarity and/or precision which 
depend on the extent to which one can determine correspondences with sufficient 
and convincing detail. 
With respect to temporal limitations, Rankin (2003: 207) reflects the scholarly 
consensus that “well-studied language families such as Indo/European, Uralic, 
and Afro/Asiatic suggest that our methods may be valid to a time depth of at 
least 10,000 years.” This automatically means, of course, that relations with a longer 
time depth will not be recognized very well. 


2 A Discrepancy 


There is a discrepancy between findings from the historical linguistic study of 
contact-induced language change and contemporary language contact studies. 
Historical linguists have found few if any constraints on contact-induced language 
change. As Thomason and Kaufman (1988: 14) write: “as far as the strictly lin- 
guistic possibilities go, any linguistic feature can be transferred from any language 
to any other language.” In contrast, language contact specialists have found that 
specific contemporary contact settings are constrained in various ways. Myers- 
Scotton (1993) has found systematic asymmetries between the dominant matrix 
language and the embedded language in insertional code-switching. Similarly, 
Poplack, Sankoff, and Miller (1988) and van Hout and Muysken (1994) have found 
that the frequency with which lexical categories are borrowed follows quite regular 
statistical patterns. 

To resolve the discrepancy mentioned, we need to investigate the linguistic effects 
of language contact at various levels of aggregation and at different time depths. 
In the literature on language contact, data from four levels are used, often 
indiscriminately. These levels can not only be distinguished across the dimensions 
“space” (or aggregation level) and “time” (time depth), but also in terms of typ- 
ical data sources and disciplines, and in terms of the extent to which scenarios 
are invoked. 

Disentangling these four levels is necessary as a point of departure. The key 
notion here is “scenario”: the organized fashion in which multilingual speakers, 
in certain social settings, deal with the various languages in their repertoire. This 
is the micro level in Table 13.1. In the specific scenarios studied at this level, specific 
principles or constraints hold concerning what contact-induced change is found, 
depending also on the typological properties of the languages involved. In an ideal 
world what happens at the micro level of the bilingual community should be the 
direct consequence of the behavior of individual bilinguals. Thus, one would hope 
to be able to derive constraints on contact directly from psycholinguistic studies. 
However, psycholinguistic evidence is often difficult to interpret in these terms 
and equally often it is contradictory since multilingual speakers do not exist in a 
sociolinguistic vacuum. 

The different levels of aggregation eventually need to be studied separately to 
see whether we can apply insights gained at lower levels to higher levels of 
aggregation, so that the applicability of results across the different levels can 
itself be made a specific object of study. 
Table 13.1 Four levels of aggregation and time depth in studying language contact 

Level   Space                      Time                       Source                                Disciplines             Scenarios 
Person  Bilingual individual       0-50 years                 Recordings, tests, experiments        Psycholinguistics       Brain connectivity 
Micro   Bilingual community        20-200 years               Recordings, fieldwork observations    Sociolinguistics        Specific contact scenarios 
Meso    Geographical region        Generally 200-1,000 years  Comparative data, historical sources  Historical linguistics  Global contact scenarios 
Macro   Larger areas of the world  Deep time                  Typological data                      Areal typology          Vague or no contact scenarios 


3 Constraints 


Clearly, the question is how we can further the historical and comparative study 
of the languages of the world using insights from outside the domain under study. 
The classical answer from linguistics is to formulate constraints on language change. 
If we can formulate universal constraints on the way languages develop perhaps 
we can better understand complex relations between languages where the com- 
parative method cannot be applied successfully. The key question I want to raise 
in this research proposal is: Can we establish, on the basis of synchronic evidence, 
which way different languages have contributed systemic elements in different 
contact settings? In other words: Can we develop a general view of contact pro- 
cesses which leads to the formulation of constraints? 

In the 1970s and 1980s there was widespread confidence in structural analysis 
and its implications for various sub-disciplines in linguistics. Principle and Para- 
meter approaches to linguistic typology derived from Chomsky’s work (Chomsky 
1981; Baker 1996) inspired confidence in the idea that language variation was 
structured in terms of simple binary choices. In a similar vein, typological research 
inspired by Greenbergian Language Universals (Greenberg 1966) led to the search 
for structural universals (absolute or implicational) in the correlation of linguistic 
features of languages. Language contact studies operated in terms of “constraints,” 
such as the constraints assumed to hold for code-switching (e.g. DiSciullo, 
Muysken, and Singh 1986). The idea that language change, including contact-induced 
change, is constrained by universals thus seemed a plausible assumption. 
Table 13.2 Constraints on contact-induced change discussed by Campbell (1993) 

Page in Campbell (1993)   Constraint proposed 
91                        Borrowing only occurs when there is structural compatibility 
94                        Borrowing should fit with the innovation possibilities of the borrowing language 
96                        Grammatical gaps tend to be filled through borrowing 
98                        Borrowing only replaces existing material 
99                        Freestanding grammatical morphemes are more easily borrowed than bound morphemes 
100                       Borrowability of elements is based on the ranking of grammatical categories 
100                       Elements with more than local functional value cannot be borrowed 
101                       Borrowing is furthered by the reduction of allomorphy 
101                       Non-lexical properties are borrowed after lexical elements (Moravcsik) 
102                       There is no borrowing of bound forms unless words containing those forms are also borrowed (Moravcsik) 
102                       Verbs cannot be borrowed as such (Moravcsik) 
103                       There is no borrowing of inflections without borrowing of derivations (Moravcsik) 
104                       Grammatical elements cannot be borrowed without their ordering properties (Moravcsik) 


3.1 Doubts 


The publication of Thomason and Kaufman (1988) caused a major upheaval, 
however, since they argue that the idea that there are intrinsic constraints on 
language contact should be abandoned. In a similar vein, Curnow (2001: 434) 
takes a cautious or, if you will, pessimistic view and writes: “It is possible that 
a variety of constraints on borrowing in particular contexts can be developed. But 
the attempt to develop any universal hierarchy of borrowing should perhaps 
be abandoned.” A typical example of the reaction of historical linguists to the uni- 
versal constraints approach can be found in Campbell (1993) where he discusses 
proposed universals of grammatical borrowing extensively (some of which are 
proposed in Moravcsik 1978) and where he is generally skeptical of their universal 
validity. These are listed in Table 13.2. 

In virtually all cases, Campbell is able to furnish counter-examples from 
individual cases of change, while acknowledging that the proposed constraints 
certainly hold as tendencies. 
3.2 Nichols’ scenario approach to the stability of linguistic properties 


A dominant theme in the work of Johanna Nichols concerns the stability of ele- 
ments. Nichols (2003: 285) correctly views stability of linguistic properties or items 
as a relative rather than absolute notion. In her view, it involves four different 
dimensions: the chance that a feature is inherited (due to typological implication), 
the chance that it is borrowed, the chance that it is retained as a substratum 
feature in a process of shift, and the chance of its selection due to typological 
pressure. A number of features are presented in her work, on the basis of which 
Table 13.3 was compiled. Following this table, ergativity should be viewed as 
a recessive feature, in Nichols’ view, i.e. it is easily lost, rarely borrowed, not 
often selected. The notion of resonance here refers to the internal coherence of 
a paradigm. The table is compiled on the basis of empirical data from different 
areas, as well as some deductive reasoning. However, the reasoning behind it 
is not always explicit, and it is not always clear to me what determines the 
evaluation of a particular feature in each of the categories. Nonetheless, it is 
useful to isolate different factors that may be linked to typological or processing 
concerns. 


Table 13.3 Summary of features discussed by Nichols (2003) in terms of a scenario model 

Feature                 Inherit  Borrow       Substratum  Select 
Basic vocabulary        H        L            ?           n.a. 
Pronouns                H        L            L           Variable 
Pronouns in general     H        L            ?           n.a. 
Resonance in pronouns   H        H            H?          H 
Resonance in mama-papa  ?        H?           H?          H 
Ergativity              L        L            H?          L 
Phonetics               H        H            H           Variable 
Sound patterns          H        L?           H?          L? 
Canonical syllable      H        Variable?    Variable?   Variable? 
Front vowel raising     H        H            H           H 
Numeral classifiers     Not H    Not H        ?           Nil 
Gender                  Not H    L            ?           Nil? 
Inclusive-exclusive     H        Appreciable  High?       L 
SOV                     H        H            H?          H 
SVO                     H?       Somewhat H   ?           ? 
Verb-initial            L        L            H?          L 

H = high probability; L = low probability 
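
The reading of ergativity as a recessive feature can be made explicit with a small sketch. It encodes a subset of rows from Table 13.3 (those whose cells are unambiguous in the printed table) and flags features rated low for inheritance, borrowing, and selection. Equating "recessive" with a low rating in exactly these three columns is an assumption of this sketch, as are the names TABLE_13_3 and is_recessive; a trailing "?" is treated as a hedge on the rating rather than a different value.

```python
# Selected rows of Table 13.3: (inherit, borrow, substratum, select).
# "?" after a value marks the uncertainty shown in the table itself.
TABLE_13_3 = {
    "Basic vocabulary": ("H", "L", "?", "n.a."),
    "Resonance in pronouns": ("H", "H", "H?", "H"),
    "Ergativity": ("L", "L", "H?", "L"),
    "Phonetics": ("H", "H", "H", "Variable"),
    "Sound patterns": ("H", "L?", "H?", "L?"),
    "Front vowel raising": ("H", "H", "H", "H"),
    "SOV": ("H", "H", "H?", "H"),
    "Verb-initial": ("L", "L", "H?", "L"),
}


def is_recessive(cells):
    """Assumed definition: low chance of inheritance, borrowing, and selection."""
    inherit, borrow, _substratum, select = cells
    # Strip the hedging "?" so that "L?" counts as a (tentative) low rating.
    return all(value.rstrip("?") == "L" for value in (inherit, borrow, select))


recessive = [name for name, cells in TABLE_13_3.items() if is_recessive(cells)]
print(recessive)  # ['Ergativity', 'Verb-initial']
```

Under this reading, verb-initial order patterns with ergativity, since both rows show the same ratings in the table.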
3.3 Types of borrowability hypotheses 


Even if we are critical of the idea that there are absolute constraints, independent of 
scenario, it still makes sense to see whether we can formulate scenario-specific 
and probabilistic constraints. Remaining in the area of borrowability hierarchies 
(for the moment in a very broad sense), this would give us the following types 
of hierarchies in the lexical domain: 


1 syntactic elements > discourse markers (that > OK) 

2 core vocabulary > non-core vocabulary > animal and plant names > technical 
vocabulary (hand > computer) 

3 determiners, conjunctions > verbs, adpositions > nouns, adjectives > names 

4 low numbers > high numbers (two > million) 

5 first, second person pronouns > third, fourth (inclusive) person pronouns 

6 simple kinship > complex kinship (sister > cousin) 

7 basic colors > peripheral colors (white > orange) 


It may be possible to extend these hierarchies to components of the language 
system as such, but this is more complicated: 


8 subordinate clauses > main clauses 
9 syntax > morphology > lexicon (word order > diminutive > adjective) 
10 phonological organization > phonetic realization (/i/ : /e/ contrast > velar r) 


These hierarchies are drawn from a wide range of semantic and grammatical 
domains, and the list is by no means exhaustive, but it shows the possibilities 
which can be recognized if one sets one’s mind to it. 
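
To test such a hierarchy against borrowing data, it first has to be stated in a form that can be queried. The following minimal sketch stores the vocabulary hierarchy (item 2 above) as an ordered list and checks the relative position of two categories. It takes no position on how ">" is to be interpreted; it only records the order as printed. The names VOCABULARY_HIERARCHY and precedes are illustrative.

```python
# The vocabulary hierarchy from the list above, in the order printed
# (the category to the left of each ">" comes first).
VOCABULARY_HIERARCHY = [
    "core vocabulary",
    "non-core vocabulary",
    "animal and plant names",
    "technical vocabulary",
]


def precedes(hierarchy, a, b):
    """Return True if category `a` stands to the left of `b` in the hierarchy."""
    return hierarchy.index(a) < hierarchy.index(b)


# "hand" (core) versus "computer" (technical), as in the printed example.
print(precedes(VOCABULARY_HIERARCHY, "core vocabulary", "technical vocabulary"))
```

A scenario-specific, probabilistic version would attach observed borrowing rates to each category and ask how often the printed order is respected, rather than treating the order as absolute.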


4 The Scenario Approach 


How do we proceed then? Curnow (2001: 412-13) makes the important distinc- 
tion between “paths of development” and “resulting situation” in the historical 
study of language contact. Resulting situations of change and contact can be quite 
opaque as to the factors that have brought them about, while individual paths of 
development may be much more transparent. Similarly, exceptional cases are not 
always separated clearly enough from typical ones. Contact-induced language 
change needs to come out of the “Raritätenkabinett,” i.e. throw off its image as 
a batch of oddities. Odd results of language contact were brought in as trophies 
to be shown to incredulous colleagues, just as in the era of European colonial expan- 
sion odd objects, preserved plants, sea shells, and stuffed animals were brought 
back to be shown to friends and business relations. The strategies to be adopted 
to counter this tendency are: 
1 achieve a sense of which patterns are frequent and common and which are 
rare and unexpected; 

2 within the rare and unexpected patterns, try to establish the elements which 
are still regular and expected; 

3 try to argue from the multilingual individual > to the speech community > to 
a language > to a larger region. 


Given this notion of scenario, the opposition between social and structural factors 
in language change (and consequently, contact-induced language change) is a false 
one (cf. Weinreich, Labov, and Herzog 1968): every change is both structurally 
and socially embedded. In the following schematic overview, various scenarios 
are contrasted with respect to the frequency with which they have been found to 
occur, the symmetry configuration between the languages involved, the linguistic 
features affected in this scenario, and possible constraints on the process of 
contact-induced change. 


4.1 Borrowing 


Borrowing refers to the spread of individual language items from one language 
or speech community to another. Key elements borrowed are words, and in the 
wake of words, associated derivational morphological elements and idiomatic 
meanings of phrases. 


Frequency: Highly frequent, in fact almost universal 

Symmetry configuration: Asymmetrical: from a dominant superstrate to a socially 
subordinate language 

Features involved: Relatively concrete features, generally “fabric” (audible word forms 
rather than abstract patterns). 

Constraints: Subject to various structural constraints, resulting from the need to 
preserve paradigmatic and syntagmatic constraints in the recipient language; 
these result in borrowability hierarchies as their outward manifestation 

Selected references: Poplack, Sankoff, and Miller (1988); van Hout and Muysken 
(1994) 


4.2 Grammatical convergence under prolonged 
stable bilingualism 


Similarly, the study of bilingual language use and convergence has so far not yielded 
a clear picture. While the bilinguals in Gumperz and Wilson’s (1971) study in 
Kupwar — a village in India with different castes speaking different languages — 
showed convergence between Indo-European and Dravidian languages to the point 
of virtual identity between the languages, Pousada and Poplack (1982) argue that 
there is no case for structural convergence between Spanish and English among 
Puerto Rican immigrants in New York. Silva-Corvalán (1986) argues for limited 
convergence between the same two languages in Los Angeles. Again, the situations 
differ in several ways: time depth and type of contact between the languages, 
identity of the speakers, etc. 


Frequency: Relatively frequent, though not as frequent as borrowing 

Symmetry configuration: Potentially symmetrical, depending on the bilingual com- 
munity and the status of its languages 

Features involved: May lead to surface convergence, e.g. in semantic categories and 
in word order 

Constraints: We do not know what constraints there are on the process; it may be 
structurally quite pervasive over time 

Selected reference: Emeneau (1956) is the classical historical study 


4.3 L2 learning, shift, and substrate formation 


A similar discrepancy of results is found in the study of second language (L2) develop- 
ment, and in the extent to which first language (L1) features can survive in a 
newly acquired L2, eventually as a historical substrate. Klein and Perdue (1997) 
assume that the Basic Variety, a very basic early stage in second language develop- 
ment, results from L1-free, basically unmarked parameter settings. In contrast, 
adherents of the Full Transfer or Conservation Hypotheses, such as Schwartz 
and Sprouse (1996) and van de Craats et al. (2000; 2002) assume that in this 
stage basically L1 parameter settings hold. In this case, we cannot directly point 
to external factors to explain the discrepancy of results, since some of the very 
same subjects were involved in some of the studies. 


Frequency: Frequent 

Symmetry configuration: Asymmetrical — from a subordinate language to a socially 
dominant language 

Features involved: Relatively abstract language features from the L1, morphological 
and morphophonemic distinctions from the L2 

Constraints: Often the transfer of L1 features is stabilized because there is an at 
least superficially similar feature already present in the L2 target. Transfer also 
tends to involve relatively abstract rather than concretely “visible” features 

Selected references: Von Wartburg (1939) for substrate; Schwartz and Sprouse (1996); 
van de Craats et al. (2000; 2002) for L1 transfer; Hickey (2007) for focusing of 
contact-induced shift varieties in unguided adult second language acquisition 


4.4 Relexification 


It should be borne in mind that even processes such as relexification are far from 
unitary; i.e., what has been counted in the literature as the result of relexification or 
intertwining actually varies with regard to several different issues (Muysken 2006): 


1 the issue of the matrix language: with Michif, Media Lengua, etc. the tradi- 
tional community language acts as a matrix, while in the Copper Island Aleut 
and the Australian cases, the “new” language acts as the matrix; 

2 the issue of convergence: in the mixed languages in the Malay sphere we find 
widespread convergence while in the case of Media Lengua and Michif con- 
vergence has been very moderate; 

3 typological issues: agglutinative/polysynthetic, head-marking/dependent-marking, 
richness of morphology. 


To take a particular example, consider Media Lengua, where highly agglutina- 
tive Quechua was the matrix language and there was only limited convergence 
(Muysken 1981; 1997): 


(1) algunos-wa-lla  sinta-n asi  no  undi-mun  i-na  kiri-k-kuna-lla 
    some.PL-DIM-DEL sit-3   thus NEG where-DIR go-NM want-AG-PL-DEL 
‘Only a few stay. Those that do not want to go anywhere.’ 

[Algunitos no mas pasan alli. Los que no quieren irse a ninguna parte.] 


The anomaly in Media Lengua is that the (italicized) roots all have Spanish phono- 
logical shapes (many with Quechua meanings). At the same time, what are 
expected in Media Lengua are the content word/functional element asymmetries, 
and the phonology of (highly frequent) Spanish borrowings in Quechua. 
Auxiliary hypotheses to explain the anomaly would include patterns of shift, 
language games and conscious creations, doubling in bilingual songs, and 
redefinition of the language-ethnicity relation. 


Frequency: Highly infrequent in a stable form 

Symmetry configuration: Clearly asymmetrical; word shapes from one language are 
grafted onto lexical structures of the matrix language, which also provides the 
grammar and most of the phonology 

Features involved: Word shapes 

Constraints: A highly constrained process, with only limited structural consequences 

Selected references: Muysken (1981, 1997, 2006) 


4.5 Leveling 


Leveling involves the selective simplification and homogenization of patterns as 
they spread from one community or area to another, as in the case of the differ- 
ent varieties of North Indian languages which were spread across the globe due 
to contract labor practices, resulting in varieties like Hindustani. 


Frequency: Relatively well-attested 

Symmetry configuration: No assumption of asymmetry, although some varieties 
involved in the leveling process may be more important than others 

Features involved: Mostly morphological, lexical, and phonological features 

Constraints: Generally not a process that affects deeper levels of grammatical 
organization 

Selected reference: Siegel (1988) 


4.6 Simultaneous acquisition of two languages by children 


In bilingual first language development, scholars like Genesee (1989) and Meisel 
(1994) have found evidence for the Autonomous Development Hypothesis, 
which claims quite separate development for the two languages of a child, while 
e.g. Sanchez (2003) has found evidence for functional interference and convergence. 
Müller and Hulk (2001) take an intermediate position, assuming that interference 
will involve the interface between different components. Notice that, independently 
of the processing constraints, which may be formulated in terms of interface 
conditions, the sociolinguistic types of children studied by Sanchez and Meisel 
differ radically: bilingual Quechua- and Spanish-speaking peasant and rural 
migrant children in Peru, versus German- and French-speaking middle-class 
children in Hamburg. 


Frequency: A relatively frequent phenomenon on the individual level. However, 
not so frequent at the community level 

Symmetry configuration: In principle symmetrical, although in actual practice the 
preferred language of the mother often will play a dominant role, particularly 
if this is also the community language 

Features involved: Features typically involved in convergence during the acquisi- 
tional process are pragmatic properties of particular constructions, as well as 
some morpho-syntactic properties 

Constraints: On the whole, basic structural, semantic, and phonetic properties of 
the different languages involved tend to be preserved 

Selected reference: Romaine (1995) 


4.7 Metatypy or restructuring 


Metatypy (Ross 1999) or contact-induced restructuring (Winford 2003) is an as 
yet not well-understood type of contact-induced language change. According to 
Ross (1999: 7) metatypy is a “change in morphosyntactic type and grammatical 
organization (and also semantic patterns) which a language undergoes as a result 
of its speakers’ bilingualism in another language.” Ross has based his initial 
definition of metatypy on research on Karkar Island in the Pacific, noticing 
that the western Oceanic language Takia has undergone significant structural 
influences from the dominant neighboring Papuan Waskia language, presumably 
from the Trans-New Guinea Phylum. Contact has been so intense that words in 
the two languages can now be matched one by one: 


Frequency: In Ross’ own work, about a dozen cases are mentioned; the process is 
possibly relatively frequent 

Symmetry configuration: The change is clearly asymmetrical in that a relation of 
social dominance is involved and only one language is affected 

Features involved: Typical features involved are morpho-syntactic distinctions, 
word order patterns, collocations and phrasal semantics 

Constraints: The consequences of metatypy can be far-reaching, but work 
by Reesink, Singer, and Dunn (2009) suggests that the structural effects 
of metatypy did not obliterate the underlying “Oceanic” structural profile of 
Takia 

Selected references: Ross (1999), Winford (2003) 


4.8 Insertional code-switching 


Muysken (2000), looking at code-switching and code-mixing and summarizing 
a large body of literature, has argued that there are several types of switching 
involved: alternation, insertion, and congruent lexicalization. Factors determin- 
ing what type of switching or mixing a bilingual community will adopt are speaker- 
related (e.g. bilingual competence), social (prestige of the languages, type of 
interaction), and grammatical (typological match, lexical similarities). 

Insertional code-switching is particularly prevalent when speakers know a 
second language less well than their community language, in (post-)colonial or 
immigrant community settings. 


Frequency: Insertional code-switching, particularly on a limited scale, is extremely 
frequent in most speech communities 

Symmetry configuration: By definition the process is asymmetrical, since items 
or small constituents from one language are inserted into the frame of another 
language 

Features involved: Insertional code-switching primarily involves nouns, noun 
phrases, adjectives, adjective phrases, and adpositional phrases 

Constraints: The two main constraints proposed by Myers-Scotton involve the 
maintenance of the integrity of the matrix language, and the only limited 
possibility of inserting functional category elements. In general, grammatical 
congruence between the inserted element and the slot it enters is needed 

Selected reference: Myers-Scotton (1993) 


4.9 Adjunction and alternational code-switching 


In addition to insertional code-switching, there is also alternational code-switching, 
in which elements are more loosely combined, and fewer constraints hold. An 
example comes from the borrowing of prepositions and conjunctions in many Meso- 
American languages (e.g. Suarez 1983: 136). These constitute a counterexample to 
many proposed universal borrowing constraints. However, I argue in Muysken 
(1999) that the borrowing process that has led to the incorporation of these 
elements is distinct from ordinary borrowing processes, not only in its outcome 
but also in its other characteristics, and that the borrowed elements behave more 
like adjuncts than like insertions such as lexical noun borrowings. 


Frequency: There is no doubt that alternational code-switching is also frequent, 
particularly in communities where both languages are widely spoken, but also 
in cases where there is political competition between the languages involved 

Symmetry configuration: Symmetrical; the two languages involved are juxtaposed 

Features involved: Adjunction and alternational code-switching on the lexical level 
particularly involve conjunctions, adverbs, and adpositions and, on the phrasal 
and clausal level, adverbial clauses, coordinate clauses, and adverb phrases 

Constraints: The constraints formulated by Poplack (1980) are the Free Morpheme 
Constraint, blocking switching inside the word, and the Equivalence Constraint, 
ensuring parallel word orders in the two languages around the switch site. 
In other circumstances, a peripheral or adjunct status for the element to be 
switched is required 

Selected reference: Poplack (1980) 


4.10 Language attrition and death 


In many communities small languages are subject to attrition and eventually 
language death. In the early stages, the contact may lead to metatypy or restruc- 
turing in the direction of the dominant language, but later on there is progres- 
sive disappearance of grammatical distinctions such as gender and case, loss of 
grammatical and pragmatic distinctions, etc. 


Frequency: While literally thousands of languages are threatened with extinction, 
not all go through a prolonged period of structural attrition 

Symmetry configuration: Clearly this is an asymmetrical process 

Features involved: Attrition involves all features of a language: the lexicon, the 
phonology, the morphology and morphosyntax along with the more complex 
syntactic and stylistic possibilities of a language 

Constraints: There are no constraints, except possibly markedness hierarchies 
that help predict the order in which attrition affects different components of 
a language 

Selected references: Dorian (1981), Nettle and Romaine (2000) 


4.11 Creation of symmetric contact languages or jargons 


In many different parts of the world contact languages have emerged in which 
both contributing partners have provided lexical elements and grammar. These 
contact languages are often the result of incidental trade relations, as in the case 
of Russenorsk, where Russians exchanged fish for agricultural produce along the 
North Cape of Norway (Jahr & Broch 1996). 


Frequency: Many of these languages have emerged and probably disappeared 
without a trace; perhaps a dozen or so have been documented 

Symmetry configuration: In principle these languages are in a symmetrical 
relationship 

Features involved: There is lexicon from both contributing languages (and often from 
others as well). The grammar either comes from contributing languages, or is 
the result of universal processes of (often paratactic) arrangement 

Constraints: Constraints on these languages are limited morphological complexity, 
limited range of registers, and often not very extensive possibilities for constituent 
and clause embedding 

Selected references: Bakker (1994), Winford (2003) 


5 Conclusions 


What the different scenarios surveyed above show is that multilingual speakers 
do not operate in a sociolinguistic vacuum. Processing constraints on language 
interaction are always mediated by social context. The scenario model allows us 
to confront this result. Rather than trying to abstract away from the historical con- 
text by formulating abstract constraints, concrete scenarios are postulated, with 
well-defined characteristics. The reasoning is thus that a specific linguistic result 
is linked to a historical setting, involving specific people (age, ethnicity, mix) with 
specific languages, languages interacting following specific scenarios, which are 
governed by well-defined processing constraints. 

The scenario approach presented assumes that these scenarios are relatively 
autonomous from one another, and internally homogeneous. This may not 
always be a realistic assumption. Language contact specialists have noted that 
different processes often co-occur. Still the autonomy assumption helps us clarify 
specific situations and distinguish between different processes. 

Notice that I have not included creolization, the creation of new, comprehen- 
sive language systems, such as the Caribbean Creole languages, based on (contro- 
versially: simplified) elements of earlier systems in the present listing. The reason 
is that what we call the class of Creole languages first of all is rather hetero- 
geneous, not very well defined, and in linguistic terms largely restricted to one 
language family (the Indo-European family) and one set of historical circumstances 
(European colonial expansion from ca. 1500 to ca. 1900). 

The list of scenarios given above is not meant to be exhaustive. Still other 
language contact scenarios may be proposed. Their number is not finite and 
various levels of granularity are possible. The extent to which specific settings 
from different parts of the world resemble each other sufficiently to be subsumed 
under the same scenario heading is an empirical issue which needs to be 
resolved. 

In the same vein, larger or much rougher subdivisions are imaginable, such as 
Thomason and Kaufman’s (1988) three-way split into maintenance, borrowing, 
and creation scenarios. In Muysken (forthcoming) four principal strategies are 
identified guiding bilingual language behavior, strategies which combine in dif- 
ferent ways in different contact settings: reliance on L1 knowledge, focus on L2 
knowledge, maximization of compatibility between L1 and L2 knowledge, and 
reliance on universal communicative strategies. 


REFERENCES 


Baker, Mark C. 1996. The Polysynthesis 
Parameter. New York: Oxford University 
Press. 

Bakker, Peter 1994. Pidgins. In Jacques 
Arends, Pieter Muysken, and Norval 
Smith (eds.), Pidgins and Creoles: 
An Introduction. Amsterdam: John 
Benjamins, pp. 25-39. 

Bakker, Peter and Maarten Mous (eds.) 
1994. Mixed Languages. Amsterdam: 
IFOTT. 

Campbell, Lyle 1993. On proposed 
universals of grammatical borrowing. 
In Henk Aertsen and Robert J. Jeffers 
(eds.), Historical Linguistics 1989: 
Papers from the 9th International 
Conference on Historical Linguistics. 
Amsterdam: John Benjamins, 
pp. 91-110. 

Chomsky, Noam 1981. Lectures on 
Government and Binding. Dordrecht: 
Foris. 

Craats, Ineke van de, Norbert Corver, and 
Roeland van Hout 2000. Conservation 
of grammatical knowledge: on the 
acquisition of POSSessive noun phrases 
by Turkish and Moroccan Arabic 
learners of Dutch. Linguistics 38: 
221-314. 

Craats, Ineke van de, Roeland van Hout, 
and Norbert Corver 2002. The 
acquisition of possessive HAVE-clauses 
by Turkish and Moroccan learners of 
Dutch. Bilingualism: Language and 
Cognition 5: 147-74. 

Curnow, Timothy J. 2001. What language 
features can be borrowed. In Alexandra 
Y. Aikhenvald and Robert M. W. Dixon 
(eds.), Areal Diffusion and Genetic 
Inheritance. Oxford: Oxford University 
Press, pp. 412-36. 

DiSciullo, Anne-Marie, Pieter Muysken, 
and Rajendra Singh 1986. Government 
and code-mixing. Journal of Linguistics 
22: 1-24. 


Dixon, Robert M. W. 1997. The Rise 
and Fall of Languages. Cambridge: 
Cambridge University Press. 

Dorian, Nancy C. 1981. Language Death: 
The Life Cycle of a Scottish Gaelic Dialect. 
Philadelphia: University of 
Pennsylvania Press. 

Emeneau, Murray 1956. India as 
a linguistic area. Language 32: 3-16. 

Genesee, Fred 1989. Early bilingual 
development: one language or two? 
Journal of Child Language 16: 161-79. 

Greenberg, Joseph H. 1966. Some 
universals of grammar with particular 
reference to the order of meaningful 
elements. In Joseph H. Greenberg (ed.), 
Universals of Language. Cambridge, MA: 
MIT Press, pp. 73-113. 

Gumperz, John J. and Robert Wilson 1971. 
Convergence and creolization: a case 
from the Indo-Aryan/Dravidian border 
in India. In Dell Hymes (ed.), 
Pidginization and Creolization of 
Languages. Cambridge: Cambridge 
University Press, pp. 151-67. 

Harrison, Shelly P. 2003. On the limits of 
the comparative method. In Brian D. 
Joseph and Richard D. Janda (eds.), 
The Handbook of Historical Linguistics. 
Oxford: Blackwell, pp. 213-43. 

Haspelmath, Martin 2001. The European 
linguistic area: Standard Average 
European. In Martin Haspelmath, 
Ekkehard König, Wulf Oesterreicher, 
and Wolfgang Raible (eds.), Language 
Typology and Language Universals 
(An International Handbook), vol. 2. 
Berlin/New York: de Gruyter, 
pp. 1492-510. 

Haspelmath, Martin, Matthew S. Dryer, 
David Gil, and Bernard Comrie (eds.) 
2005. The World Atlas of Language 
Structures. Oxford: Oxford University 
Press. 


Haugen, Einar 1950. The analysis of 
linguistic borrowing. Language 26: 
210-31. 

Hickey, Raymond 2007. Irish English: 
History and Present-Day Forms. 
Cambridge: Cambridge University 
Press. 

Hout, Roeland van and Pieter Muysken 
1994. Modelling lexical borrowability. 
Language Variation and Change 6: 39-62. 

Jahr, Ernst Håkon and Ingvild Broch (eds.) 
1996. Language Contact in the Arctic: 
Northern Pidgins and Contact Languages. 
Berlin: Mouton de Gruyter. 

Klein, Wolfgang and Clive Perdue 1997. 
The basic variety (or: couldn’t natural 
languages be much simpler?). Second 
Language Research 13: 301-47. 

Kusters, Wouter 2003. Linguistic 
Complexity: The Influence of Social 
Change on Verbal Inflection. Doctoral 
dissertation, Leiden University. 

Lightfoot, David W. 1979. Principles of 
Diachronic Syntax. Cambridge: 
Cambridge University Press. 

McWhorter, John 1998. Identifying the 
creole prototype: vindicating a 
typological class. Language 
74: 788-818. 

Meisel, Jürgen (ed.) 1994. Bilingual First 
Language Acquisition: French and German 
Grammatical Development. Amsterdam: 
John Benjamins. 

Moravcsik, Edith A. 1978. Language 
contact. In Joseph H. Greenberg (ed.), 
Universals of Human Language, vol. 1: 
Method and Theory. Stanford: Stanford 
University Press, pp. 94-122. 

Müller, Natascha and Aafke Hulk 2001. 
Crosslinguistic influence in bilingual 
language acquisition: Italian and French 
as recipient languages. Bilingualism: 
Language and Cognition 4: 1-22. 

Muysken, Pieter 1981. Halfway between 
Quechua and Spanish: the case for 
relexification. In Arnold Highfield and 
Albert Valdman (eds.), Historicity and 
Variation in Creole Studies. Ann Arbor, 
MI: Karoma, pp. 52-78. 


Muysken, Pieter 1997. Media Lengua. 
In Sarah G. Thomason (ed.), Contact 
Languages: A Wider Perspective. 
Amsterdam: John Benjamins, 
pp. 365-426. 

Muysken, Pieter 1999. Three processes of 
borrowing: borrowability revisited. In 
Guus Extra and Ludo Verhoeven (eds.), 
Migrants and Bilingualism: Proceedings of 
the Tilburg Workshop on Language Change 
in Minority Communities, January 1996. 
Berlin: Mouton de Gruyter, pp. 229-46. 

Muysken, Pieter 2000. Bilingual Speech: 
A Typology of Code-Mixing. Cambridge: 
Cambridge University Press. 

Muysken, Pieter 2006. Mixed codes. 
In Peter Auer and Li Wei (eds.), 
Multilingual Communication. Berlin: 
Mouton de Gruyter, pp. 303-28. 

Muysken, Pieter, forthcoming. Modeling 
language contact. 

Myers-Scotton, Carol 1993. Duelling 
Languages. Oxford: Clarendon Press. 

Nettle, Daniel and Suzanne Romaine 
2000. Vanishing Voices: The Extinction 
of the World’s Languages. New York: 
Oxford University Press. 

Nichols, Johanna 1992. Linguistic Diversity 
in Space and Time. Chicago/London: 
The University of Chicago Press. 

Nichols, Johanna 2003. Diversity and 
stability in language. In Brian D. Joseph 
and Richard D. Janda (eds.), The 
Handbook of Historical Linguistics. 
Oxford: Blackwell, pp. 283-310. 

Poplack, Shana 1980. Sometimes I'll start 
a sentence in Spanish Y TERMINO EN 
ESPAÑOL: toward a typology of 
code-switching. Linguistics 18: 581-618. 

Poplack, Shana, David Sankoff, and 
Christopher Miller 1988. The social 
correlates and linguistic processes of 
lexical borrowing and assimilation. 
Linguistics 26: 47-104. 

Pousada, Alicia and Shana Poplack 1982. 
No case for convergence: the Puerto 
Rican Spanish verb system in a 
language-contact situation. In Joshua 
Fishman and Gary D. Keller (eds.), 
Bilingual Education for Hispanic Students 
in the United States. New York: 
Columbia University Teacher’s College 
Press, pp. 207-40. 

Rankin, Robert L. 2003. The comparative 
method. In Brian D. Joseph and Richard D. 
Janda (eds.), The Handbook of Historical 
Linguistics. Oxford: Blackwell, 
pp. 183-212. 

Reesink, Ger, Ruth Singer, and Michael 
Dunn 2009. Explaining the linguistic 
diversity of Sahul using population 
models. Public Library of Science Biology, 
November (online journal). 

Romaine, Suzanne 1995. Bilingualism, 
2nd edn. Oxford: Blackwell. 

Ross, Malcolm D. 1991. Refining Guy’s 
sociolinguistic types of language 
change. Diachronica 8: 119-29. 

Ross, Malcolm D. 1999. Exploring 
metatypy: how does contact-induced 
typological change come about? 
Keynote address to the meeting of 
the Australian Linguistic Society, 
Perth, October. 

Sanchez, Liliana 2003. Quechua-Spanish 
Bilingualism. Interference and Convergence 
in Functional Categories. Amsterdam: 
John Benjamins. 

Schwartz, Bonnie D. and Richard A. 
Sprouse 1996. L2 cognitive states and 
the full transfer, full access model. 
Second Language Research 12: 40-72. 

Siegel, Jeff 1988. Language Contact in a 
Plantation Environment: A Sociolinguistic 
History of Fiji. Cambridge: Cambridge 
University Press. 


Silva-Corvalan, Carmen 1986. Bilingualism 
and language change: the extension of 
estar in Los Angeles Spanish. Language 
62: 587-608. 

Smith, Norval 1994. An annotated list of 
creoles, pidgins, and mixed languages. 
In Jacques Arends, Pieter Muysken, and 
Norval Smith (eds.), Pidgins and Creoles: 
An Introduction. Amsterdam: John 
Benjamins, pp. 331-74. 

Suarez, Jorge A. 1983. The Mesoamerican 
Indian Languages. Cambridge: 
Cambridge University Press. 

Thomason, Sarah G. (ed.) 1997. Contact 
Languages: A Wider Perspective. 
Amsterdam: John Benjamins. 

Thomason, Sarah G. and Terence S. 
Kaufman 1988. Language Contact, 
Creolization and Genetic Linguistics. 
Berkeley: University of California 
Press. 

Trudgill, Peter 1986. Dialects in Contact. 
Oxford: Blackwell. 

Wartburg, Walter von 1939. Réponses au 
Questionnaire du Ve Congrès international 
des Linguistes. Bruges. 

Weinreich, Uriel 1953. Languages in 
Contact. The Hague: Mouton. 

Weinreich, Uriel, William Labov and 
Marvin Herzog 1968. Empirical 
foundations for a theory of language 
change. In Winfred Lehmann and 
Yakov Malkiel (eds.), Directions for 
Historical Linguistics. Austin: University 
of Texas Press, pp. 95-188. 

Winford, Donald 2003. An Introduction to 
Contact Linguistics. Oxford: Blackwell. 


14 Ethnic Identity and 
Linguistic Contact 


CARMEN FOUGHT 


1 Introduction: Ethnicity, Contact, and 
Sociolinguistic Variation 


Since the 1960s, research on sociolinguistic variation has revealed the complex ways 
in which our speech reflects our construction of identity. We now know that small 
quantitative differences in the use of phonetic variables reflect social structure at 
many (sometimes competing) levels, even when speakers themselves are not con- 
sciously aware of these variables or of their own usage. A significant proportion 
of this research has focused on ethnicity and language, a field of study with a 
tradition dating back well before variationist concerns emerged as a research focus. 
Different ethnic groups within a nation or region have frequently granted a cen- 
tral role to a heritage language in defining and uniting the group (cf. Fishman 
2001). As sociolinguists began to learn more about variation within a language, 
they discovered, unsurprisingly, that ethnic differences had a significant influence 
on this micro-level variation as well. 

Labov, in his research on the dialects of New York City (e.g. Labov 1966; 1972b), 
definitively placed ethnicity at the very center of the sociolinguistic tradition. From 
a sociological perspective, it makes sense that this should be so. In the United 
States, where Labov’s work was centered, as in most places around the world, 
ethnic distinctions play a crucial role in social structure. A dominant European- 
American group has received social privilege, while other groups have been sub- 
ject to economic and social sanctions ranging from slavery to job discrimination. 
Members of minority ethnic groups are disproportionately represented in lower 
socio-economic brackets, and underrepresented in positions of social power and 
authority. It would be strange indeed if such crucial social distinctions were not 
reflected in language. 

Ethnicity is also distinct from other social factors in certain key ways. Gender, 
for example, has also been studied extensively in the sociolinguistic tradition. Some 
of the same types of discrimination mentioned earlier have affected women as 
well, such as being underrepresented in positions of authority. However, women 
and men are not routinely segregated from each other, living in different neigh- 
borhoods, for example. People have daily face-to-face contact with others whose 
gender differs from theirs, while they would not necessarily have such contact 
with people from other ethnic groups. Social class, another important variable, 
may be more likely to lead to segregation. But it is also one of the more malleable 
social factors, at least in the United States and many other Western societies. One 
is much more likely to change one’s social class over the course of a lifetime than 
to change one’s gender or ethnicity (although these latter options are simply less 
likely, not impossible, of course). We would expect, then, that given the social 
structures of modern communities, ethnicity would show a strong correlation with 
language, and, in fact, it does. 

Women and men, for example, do not usually speak different languages. In 
contrast, although it is not generally the case in Western societies, there are some 
places where social class differences might involve the use of entirely different 
languages (e.g. in situations of linguistic diglossia). Yet even in these cases, 
ethnicity is often tied to social class in a way that makes it difficult to separate 
the two. In many Latin-American countries, for example, the upper classes may 
speak Spanish, while the lowest classes speak an indigenous language, such as 
Quichua or Yucatec Mayan. At the same time, however, the higher social classes 
consist mostly of Latinos, while members of indigenous groups belong to lower 
socioeconomic groups. Historically, the different languages are tied to differences 
in ethnic group membership, and the correlation with social class is a secondary 
one, resulting from the economic consequences of discrimination based on 
ethnicity. 

It is crucial to keep in mind that ethnicity, as discussed here, is not some 
scientifically grounded and definable entity, but rather the label for a particular 
society’s construction of ethnic differences, and the ideological views associated 
with them. Biologists and other researchers have had no success in establishing 
any sort of scientific basis for racial distinctions. Within a community, however, 
ethnic categories will nonetheless be constructed that correspond to perceptions 
of phenotype and other relevant factors. These categories will be based on his- 
torical and social influences on belief systems, rather than biology, but they still 
play a crucial role in understanding the social structure. It is possible, then, to 
study a community’s ideologies about ethnic identity at a particular place and 
time, and to relate these ideas to language variation. 

In addition, one possible view of ethnicity is that it hinges crucially on ethnic 
contact. Barth (1969), for example, a classic sociological work on the topic 
of race and ethnicity, lists four elements in the definition of an ethnic group: 


1 it is largely biologically self-perpetuating 

2 it shares fundamental cultural values 

3 it makes up a field of communication and interaction 

4 it has a membership which identifies itself, and is identified by others, as 
constituting a category distinguishable from other categories of the same 
order. 


It is the fourth element which Barth himself identifies as “critical” (1969: 13). More 
recently, Zelinsky (2001) defines “ethnic group” in the following way: 


The ethnic group is a modern social construct, one undergoing constant change, 
an imagined community too large for intimate contact among its members, persons 
who are perceived by themselves and/or others to share a unique set of cultural and 
historical commonalities ... It comes into being by reasons of its relationships with 
other social entities, usually by experiencing some degree of friction with other groups 
that adjoin it in physical or social space. (Zelinsky 2001: 44; original in italics) 


Under the definitions proposed by these authors, we could argue that a member 
of an isolated tribe who had never had any contact with anyone outside his or 
her village would, in a sense, lack an ethnicity. Though of course this person would 
have a phenotype and other physical characteristics, these would only become 
relevant in the sense of “ethnicity” when such a person experienced contact 
with individuals from a different group, whose phenotype was perceived as 
dissimilar. 

Even if one subscribes to this theoretical view, the point is largely moot. Most 
individuals in modern societies do have some contact with or knowledge of 
others outside their ethnic group. So the construction of ethnicity normally takes 
place in a context of ethnic contact, and by extension, the linguistic construction 
of ethnic identity takes place in a situation of linguistic contact. Even where there 
is little direct contact between groups, the media provide an opportunity for 
indirect observation of other groups, generally promoting the dominant social 
ideologies about these groups at the same time. As individuals construct their 
identities, they develop linguistic resources for signaling their affiliation with 
certain groups, and their social distance from others. 

Given the centrality of inter-group contact in the construction of ethnic identity, 
we would expect linguistic contact to play a crucial role in language variation 
based on ethnicity. As with so many other aspects of sociolinguistic variation, how- 
ever, the effects of such contact are complex and multi-layered. Theoretically, the 
issue of linguistic contact among ethnic groups raises some interesting questions. 


1 Will such groups tend to converge linguistically over time, by borrowing 
features from each other? 

2 Conversely, will changes in a variety be motivated by a shift away from use 
by other ethnic groups? 

3 Is it possible for both of these types of changes to take place as the result of 
contact between groups? 

4 Will the direction of the shift depend on whether or not a variety is associated 
with the dominant ethnic group in that region? 


From what we know about sociolinguistic variation generally, we might expect 
the answer to depend at least partly on factors specific to the community in ques- 
tion, such as the degree of historical conflict, or whether there are more than two 
ethnic groups involved. 

The research that we have to date suggests that there is a wide range of vari- 
ation in the possible effects of interethnic contact, at both the community and the 
individual levels. In addition, the effects seem to be more closely tied to social 
and linguistic ideologies than to the nature of the contact itself. As Walt Wolfram 
(cited in Hazen 2000) notes: 


dialect adoption is not a simple matter of who you interact with under what 
circumstances — it’s a matter of how you perceive and project yourself — much 
more capturable in cultural identity schemes than interactional reductionism. 
(Hazen 2000: 126) 


To analyze the effects of linguistic contact, then, we must understand the context 
in which speakers in a community construct their own ethnicity, as well as the 
ideologies that affect how they view other groups. 


2 Diverse Settings, Diverse Patterns of 
Convergence 


Taken as a whole, the research on ethnic identity and linguistic contact shows 
that different settings, both historical and social, yield very different patterns 
in terms of the results of interethnic contact on particular language varieties. 
Furthermore, the patterns are complex ones. We cannot predict, for example, that 
a high degree of interethnic contact will necessarily yield a high degree of lin- 
guistic convergence. In addition, there may be considerable variability in terms 
of the effects of contact within a group. In some cases, other factors significant to 
the construction of identity (gender, social class, and so forth) may delineate 
subgroups within the ethnic group that show a stronger effect of contact than 
others. There is also the possibility that some individuals within the group will 
exhibit unusually high (or low) frequencies of features that stem from contact with 
other linguistic varieties. 

In this section, I will discuss the range of effects that have been documented 
when two or more linguistic varieties come into contact across ethnic boundaries. 
I have selected key cases from the sociolinguistic literature to illustrate each type 
of situation. The settings and individuals discussed have been organized gener- 
ally according to degree of cross-linguistic influence, although these categories 
are somewhat subjective and not intended to make any theoretical claims. 


2.1 Cases of minimal convergence 


As we might expect from decades of research on the sociology of race and eth- 
nicity, the power of ethnic boundaries can be very significant. While interethnic 
contact often leads to some degree of linguistic convergence by the varieties 
involved, it is also possible to have situations of significant interethnic contact 
where there is very little influence of the varieties in question on each other. Perhaps 
more surprisingly, it is also possible for an individual to maintain a separate 
linguistic identity, even where only contact with an outside ethnic group is avail- 
able, a point to which I will return below. 

One study that focuses specifically on the question of interethnic contact is 
Henderson (1996). Henderson was interested in a group of middle-class African- 
Americans in Philadelphia, who seemed to be completely integrated into 
European-American communities, reporting that they worked and socialized 
regularly with European-Americans. The research question addressed by the 
study was whether or not these speakers would exhibit the “short a” pattern, a 
complex phonological pattern typical of European-Americans in Philadelphia. 
Henderson discovered that the African-Americans in the study, despite their 
multiple interethnic contacts, did not show the “short a” pattern, and she began 
searching for a social motivation for these results. What she found was that despite 
their surface integration into European-American networks, the African-American 
speakers were nonetheless conscious of a racial ideology in the community that 
viewed them “not only as different, but [as] inferior” (1996: 139). In this particular 
setting, then, a lack of convergence with the local European-American variety makes 
perfect sense. These results reinforce the claim by Wolfram, cited above, that dialect 
accommodation stems more from issues of cultural identity than from the direct 
impact of interethnic contact. 

Another interesting case of this type is that of Muzel Bryant, an elderly 
African-American woman, studied by Wolfram and his associates, on Ocracoke 
Island (Wolfram, Hazen, & Tamburro 1997; Wolfram, Hazen, & Schilling-Estes 
1999). Muzel’s family was the only African-American family on the island, which 
meant that she had almost exclusively European-American contacts. Because there 
was no separate African-American community to which Muzel belonged, and 
she had no contacts with African-Americans on the mainland, the researchers 
wondered what patterns they might find in her speech. Would she simply have 
assimilated to the patterns of the European-American islanders? Or would the 
ethnic boundary be strong enough for Muzel’s dialect to maintain its own, 
separate norms, even without regular interaction with a larger African-American 
community? 

The researchers found that Muzel had, in fact, maintained a separate dialect, 
indicative of her African-American identity. She showed relatively few 
phonological features of the local island dialect, and in particular did not use [ɔj] for 
[aj] (as in the pronunciation of high as [hɔj]), the most salient marker of Ocracoke 
speech. Instead, her phonological system mainly had features typical of African- 
American English (AAE). Muzel did use some grammatical structures of the 
local Outer Banks variety of English (e.g. marking of third plural noun phrases 
with -s, as in The dogs goes), as well as many structures typical of AAE. Her use 
of the AAE features, though, often differed slightly from the patterns found 
among mainland African-Americans. In general, Muzel’s speech did not sound 
like that of the European-American islanders, despite the high degree of contact. 
She had only a few features of the Ocracoke dialect, and lacked those that 
were most closely tied to local identity. Despite the lack of regular contact with 
an African-American community, Muzel preserved a number of clear features of 
AAE in her dialect. The research on Ocracoke reminds us that ethnicity can be 
a formidable boundary, even in small, isolated communities, where we might expect 
a higher degree of linguistic convergence than in larger urban centers, such as 
the one studied by Henderson. 


2.2 Cases of moderate convergence 


One case that I have labeled as “moderate convergence” is documented by 
Rickford (1999: 90ff.), who studied two older speakers, a European-American man 
and an African-American woman, in a rural community on the Sea Islands off 
the Southeastern coast of the US. These speakers both had extensive contact with 
members of the other ethnic group, and Rickford predicted that there would be 
some resulting convergence in their dialect patterns. What he found was that the 
two speakers did, in fact, share a large number of phonological patterns, such as 
having stops for interdental fricatives, or using a palatalized /k/ before /a/. As 
a result, the European-American man clearly sounded “like a black Sea Islander” 
(1999: 93), according to Rickford. On the other hand, the speakers showed com- 
pletely different syntactic and morphological systems. The African-American 
woman used the creole grammatical features typical of the Sea Islands (such as 
unmarked plural nouns), while the European-American man showed a complete 
absence of any of these features. Rickford’s interpretation of his data is that 
non-standard phonological features are part of a regional Sea Island identity in 
which both African-American and European-American speakers participate, but 
non-standard morpho-syntactic features are associated with creole speakers, and 
serve as ethnic markers. This study, then, adds a critical factor to our theoretical 
understanding of language contact: the possibility that some components of the 
linguistic system may show a high degree of convergence, while other com- 
ponents show almost none. 

Another case that I have chosen to group with “moderate convergence” is that 
of “Mike,” the pseudonym of a European-American teenager studied by Cutler 
(1999). Mike is an example of a very specific type of cross-linguistic influence: 
“crossing,” which is the deliberate use of a language variety associated with a 
group to which the speaker does not belong (Rampton 1995). Unlike the other 
types of convergence discussed here, crossing is a largely individual process, rather 
than being typical of groups or even sub-groups within a community. In theory, 
crossing could involve any degree of convergence; in practice, however, it is more 
likely to be limited to minimal or moderate convergence. 

Cutler found that, beginning around the age of 13, Mike crossed into AAE, despite 
his upper middle-class status and his lack of regular contact with African- 
Americans. Mike’s use of AAE features varied greatly depending on the lin- 
guistic area involved, however. In particular, he used many phonetic features 
(e.g. stopping of fricatives, as in that pronounced [dæt]); at one point, Cutler even 
observed Mike “correcting” his pronunciation of the word ask to the stigmatized 
aks [æks]. He also used numerous lexical items associated with hip-hop culture 
(such as yo and phat). On the other hand, Mike did not use the core features of 
AAE morphosyntax, such as its complex aspectual system or the non-marking of 
third person singular verbs with -s. Ash and Myhill (1986), among others, have 
suggested that the grammatical elements of a dialect may be harder to acquire 
than lexical items or the phonology, especially for speakers like Mike, with 
limited contacts in the ethnic group associated with the borrowed variety. Like 
Rickford’s study, Cutler’s work highlights the fact that linguistic influence may 
be more significant on some areas of the linguistic system than on others. 


2.3 Cases of significant convergence 


Earlier, I discussed some cases where a high degree of contact led to minimal lin- 
guistic convergence. A natural follow-up question is: What does the other end of 
the continuum look like? In other words, what degree of linguistic convergence 
is possible? We know that a person may acquire as their native dialect a variety 
associated with another ethnic group, for instance in the case of a child adopted 
into a family of a different ethnic background, but this case would not be, strictly 
speaking, an example of linguistic contact. How much convergence is possible, 
though, in situations where interethnic contact is involved? 

Sweetland (2002) documents a case of very high linguistic convergence. 
She focuses on one individual, a young European-American woman, “Delilah” 
(a pseudonym), who grew up in a predominantly African-American area of 
Cincinnati, Ohio. Delilah is completely integrated into an African-American 
network of peers, and speaks a variety of AAE as her primary linguistic code, 
despite coming from a European-American family of origin. Unlike Mike, whose 
command of AAE syntactic features was weak, Delilah exhibits complete fluency 
in the syntactic and morphological patterns of AAE. In the phonological system, 
Delilah also shows some AAE features; there are, however, other AAE variables 
that Delilah never uses, including some highly salient features such as the 
metathesis of ask to aks mentioned earlier. Sweetland attributes Delilah’s “phono- 
logical restraint” to “her sensitivity to the norms of the black community rather 
than evidence that she is unaware of those rules” (2002: 532). Excessive use of 
AAE phonological patterns by young European-Americans outside the commu- 
nity is often the subject of overt negative commentary, so Delilah is careful to avoid 
these patterns. 

On the other hand, Sweetland reports that Delilah was often mistaken for an 
African-American speaker over the phone, possibly because of a high use of AAE 
intonation patterns. In sum, then, while she avoids certain forms, Delilah shows 
an extreme degree of influence from AAE on her speech. Again, though, it is 
not the amount of physical interaction that is most significant here. It is Delilah’s 
complete social integration into an African-American peer group, to the point that 
one of her peers describes her as “basically black” (2002: 525). 

Patterns of widespread convergence can be located at the community level as 
well. Childs and Mallinson (2004) looked at a group of African-American speakers 
in Texana, a small Appalachian mountain community. They found that the 
younger generation of African-American speakers displayed a number of key 
features of the Appalachian English variety used by European-Americans in the 
region. These included both phonological features (such as /ai/-glide reduction) 
and morphosyntactic features (such as past tense be leveling). At the same time, 
these younger speakers showed a decline in the AAE features that were used by 
the older generation of African-American residents. Instead, they seemed to 
mark their ethnic identity mainly with lexical items, such as slang terms borrowed 
from hip-hop culture. As with the previous studies, Childs and Mallinson looked 
to sociocultural factors to explain this pattern. They found that the ideologies 
of the community, particularly as perceived by the younger residents, did not 
construct regional and ethnic identity as being in conflict, a pattern which is 
common in other communities. This context allowed for a high degree of linguistic 
convergence. Nonetheless, in both cases that have been discussed, the individuals 
or groups in question maintained some features that marked a separate ethnic 
identity. 


3 Contact between Majority and Minority 
Ethnic Varieties 


A very common pattern of interethnic contact, probably the most common, is that 
of contact between members of a minority ethnic group and members of the socially 
dominant ethnic group. If the two groups have different linguistic varieties, we 
might look for there to be some influence between them, depending on the social 
context (and keeping in mind the wide range of possibilities discussed in the 
previous section). Since the dominant variety is the one that will be privileged 
in the school system, attempts may be made to impose it on speakers of other 
varieties. We know from decades of sociolinguistic research, however, that lan- 
guage corresponds strongly to identity, and that it is not easy to mandate how 
an individual will speak (barring Draconian measures such as making it illegal 
to speak a particular language or variety). In situations where the social context 
favors multiple interethnic contacts, however, some speakers who identify with 
minority ethnic groups may choose to incorporate features from a dominant 
variety into their speech (and vice versa, although this latter possibility will be 
limited by the fact that the minority variety is usually socially disfavored). Again, 
the particular social and linguistic ideologies at work will affect the degree and 
nature of linguistic assimilation. Conversely, studying the patterns of linguistic 
assimilation can illuminate the complex nuances of social structure within a 
community. 

Labov (1972a) conducted a (now classic) study on Martha’s Vineyard that 
illustrates these sorts of patterns, and encapsulates the complex ways in which 
ethnicity, social structure, and language interact. The island, at the time, had 
three main ethnic groups: residents of English descent, residents of Portuguese 
descent, and Native Americans. In addition, a critical social variable in this 
community was the degree to which speakers were oriented toward island life 
and the local community, as opposed to, for example, seeking jobs or having 
friends on the mainland. This factor of local versus extra-local orientation is given 
particular weight in small communities, where economic opportunities may be 
limited (see Fought 2006 for a more detailed discussion). On Martha’s Vineyard, 
the younger generation of adults often sought work off the island, or had aspir- 
ations to live on the mainland. 

Labov studied the centralized variants of (ay) and (aw), which were associated 
in the community with local island identity. He found that among younger 
speakers of English descent, these variables were used less frequently than in the 
older generation, as might be expected, given their orientation toward opportu- 
nities on the mainland. The younger generations of Portuguese-American and 
Native-American speakers, however, used relatively more of these key variants. 
The higher use of the variables among young speakers in the two minority 
ethnic groups signaled a desire to assert their ties to local island identity, ties which 
had been contested historically due to ethnic prejudice against them. 

Since this early study, many researchers have confirmed that minority ethnic 
group speakers may borrow elements of a majority variety, as well as participating 
in sound changes that characterize the variety used by the dominant ethnic 
group in the region. One interesting study was conducted by Fridland (2003) in 
Memphis, Tennessee. She selected this location because of the nature of the social 
and historical patterns of interethnic contact there, since Memphis has always 
had a large and socially prominent African-American population, including in 
the middle and upper classes. Fridland focused on whether or not the Memphis 
speakers were participating in the Southern Vowel Shift (a set of changes in progress 
in the southern United States). She found that the vowel systems of African- 
American speakers in her sample closely paralleled those of European-American 
speakers, in terms of which vowels were and were not shifted. Although the African- 
American speakers in the study exhibited features that specifically indexed black 
ethnic identity, the shared regional features, originating in the dominant group, 
played an important role as well. Fridland speculates that the status of both 
AAE and white Southern dialects as “less favored varieties” in the US may also 
facilitate influences from the majority variety onto that of the minority ethnic group, 
by weakening the majority variety’s association with dominance. 

The role of Latino speakers in US sound changes has been studied relatively 
infrequently, but it can provide some new perspectives on dialect contact. Fought 
(1997; 2003) looked at Chicano English in Los Angeles and the construction of 
ethnic identity among young Mexican-American speakers there. The studies 
found that the variety of Chicano English used by these speakers was systemat- 
ically different from other local varieties, showing the influence of Spanish, as well 
as independent historical developments. In addition, however, their Chicano 
English included features that were clearly attributable to the local California 
variety used by European-American speakers in the area. These included lexical 
items, such as the use of be like and be all as quotatives, and more importantly, 
features of sound changes in progress, such as the fronting of /u/ and the 
backing of /æ/. Interestingly, the increased use of these features by individuals 
did not show any correlation with the degree of interethnic contact. Instead, it 
correlated with a combination of other factors, such as social class, gender, and 
gang membership. The results of this study again confirm the comment by 
Wolfram, cited earlier, that influences across dialects are more likely to be about 
identity than about opportunities for interaction and contact. 

A study of sound change that includes speakers from a number of different 
ethnic groups is Gordon (2000), which focused on the Northern Cities Shift, an 
important set of sound changes taking place in a large inland northern section 
of the United States. Gordon’s research took place in northwest Indiana, and 
included European-Americans, African-Americans, Mexican-Americans, and a small 
group of multiracial speakers (mostly Latino/white). He found that, in general, 
the speakers who participated in the vowel shifts were predominantly white, 
confirming that ethnicity can, in some cases, serve as a strong barrier to the assim- 
ilation of language features. Interestingly, some of the multiracial speakers also 
showed a high degree of usage of the Northern Cities Shift vowels. While four 
of the five Mexican-American speakers in Gordon’s study did not participate in 
this sound change, the fifth, a young woman, showed rates of all the features that 
were very high, higher than those of some European-American speakers. Gordon 
attributes this result to the fact that the speaker grew up in a predominantly white 
neighborhood, with a mainly white peer group, in contrast to the other partici- 
pants. This individual also did not emphasize the importance of Mexican ethnicity 
to her identity in the way that the other Mexican-American speakers did. In this 
case, it is clear that cultural identity factors were involved, which may have been 
strengthened by the degree of interethnic contact as well. 

While there are numerous examples of the dominant variety of a region 
influencing speakers of a minority ethnic variety, there has been relatively little 
research on possible influences in the other direction. Given that varieties asso- 
ciated with minority ethnic groups are often subject to negative social evaluation, 
we might expect a low degree of influence from these varieties onto the more 
prestigious dominant variety. There are, however, some examples of exactly this 
type of influence in the sociolinguistic literature. 

One place where speakers of a majority variety have been found to exhibit 
a clear influence from a minority variety is New Zealand. A number of recent 
studies have documented the influence of speech styles associated with a minority 
ethnic group (Maori speakers) on the dominant group (white New Zealanders). 
Holmes (1997b), for example, suggests that Maori English may be the source of 
a recent sound change in New Zealand: the increasing use of final /z/ de-voicing. 
Additionally, she traces patterns of syllable timing in New Zealand to the 
influence of Maori English, particularly the variety spoken by middle-class 
Maoris. A separate study by Holmes (1997a) found two more features that 
seemed to have spread to Pakeha (white) speakers’ English from Maori English 
speakers: the tag particle eh, and the use of the high rising terminal contour. In 
addition to these structural features, there is also the widespread use throughout 
New Zealand of a Maori term to designate the majority population of European 
descent: Pakeha. Furthermore, it seems as if the influence from the minority 
variety is increasing, rather than declining: Holmes (1997a) observes that Pakeha 
teenagers who identify with a strong Maori peer group tend to use Maori English. 

In the United States, there has also been some work tracing historical 
influences of AAE onto varieties spoken by the dominant ethnic group, particu- 
larly in the South. Wolfram (1974), for example, noted that European-Americans 
in rural areas sometimes exhibited copula absence as a result of contact with African- 
American varieties. Similarly, Feagin (1997) found evidence that non-rhoticity (lack 
of post-vocalic /r/) in European-American dialects of the South could be traced 
to contact with AAE as well. Moreover, in the area of the lexicon, the influence 
of AAE on American English is well documented (see Baldwin 1997; Smitherman 
1997; 1998). Lexical items originating in AAE that have expanded to general use 
in American English include cola, gorilla, jazz, tote, bad (for ‘good’), and many 
others (Smitherman 1997). New words and phrases, spread by the media, flow 
constantly from AAE to the varieties of speakers in other ethnic groups (particu- 
larly younger speakers). The negative social evaluation of the minority variety, 
discussed earlier, is irrelevant, because these borrowed terms are usually considered 
“slang”; in fact, the covert prestige associated with AAE, drawing on connota- 
tions of urban “hipness,” is an asset in this context. 


4 Contact among Minority Ethnic Varieties 


In settings where there are multiple minority ethnic groups, a further possibility 
exists: that of dialect contact between two minority ethnic varieties. Furthermore, 
this type of contact may yield an even higher degree of assimilation than contact 
between minority dialects and the dominant dialect of the region. To begin with, 
different minority ethnic groups in some settings may express an ideology that 
focuses on their shared exclusion from a white middle-class dominated world. A 
Puerto Rican American man in Urciuoli’s (1996) study, for example, commented, 
“I’m more comfortable with blacks than with whites because blacks live in the 
same environment as us, they relate to us better than whites” (1996: 66). The view 
expressed by this speaker encompasses both ideological considerations and 
practical ones, such as the fact that different minority ethnic groups may live in 
close proximity (often as a result of socioeconomic factors) and experience more 
contact with each other than with members of the dominant group. 

Again, it is worth revisiting a classic variationist work which illustrates how 
these processes might work: Wolfram’s 1974 study of Puerto Rican-Americans and 
African-Americans in New York City. This study was one of the first to focus 
specifically on contact between two minority ethnic groups, rather than on con- 
tact between dominant and minority groups. In particular, Wolfram analyzed the 
language of young male Puerto-Rican speakers in the community, who often had 
multiple social contacts with African-American speakers in their peer group. He 
found that they incorporated a number of variables from AAE into their own vari- 
ety of English (Puerto Rican English), including grammatical and phonological 
items. Within the group, however, there were other factors that influenced the 
degree of use of AAE features. 

One of the most relevant issues was how perceptions of phenotype in the com- 
munity affected the integration of Puerto Rican speakers into African-American 
peer groups. Some Puerto Rican American individuals were perceived as look- 
ing more “white,” while others were perceived as looking more “black.”* Those 
whose phenotype was categorized as “white” tended to be more likely to assimi- 
late into white peer groups, and be relatively more upwardly mobile. If they showed 
any influence from AAE it was generally only reflected in the use of phonological 
features. Those who had a darker skin tone, in contrast, were more likely to 
assimilate to African-American culture. As might be expected, the highest use of 
AAE features occurred among those with extensive contact with African-American 
peers. Speakers in this group used AAE grammatical forms such as habitual be 
or negative inversion (e.g. Didn't nobody do it), as well as AAE phonological forms, 
e.g. surface realizations of /θ/ as [f], or monophthongal /ay/. The speakers who 
showed this pattern, according to Wolfram, also tended to minimize differences 
between the two groups, even to the point of “deny[ing] that the ways in which 
blacks and Puerto Ricans speak English are different” (Wolfram 1974: 37). This is 
a clear illustration of how sociocultural attitudes within the community can have 
a strong influence on the degree of dialect assimilation. 

Though the clearest cases of this type in the US involve contact between Latino 
and African-American groups, effects of contact between other groups have also 
been documented. Chun (2001), for instance, analyzes a conversation in which a 
Korean-American speaker uses AAE features. The particular case she looks at 
involves “crossing” (similar to the case of “Mike” mentioned above), rather than 
contact that influences the variety of an entire community. During the conversa- 
tion, the Korean-American speaker uses AAE to draw on stereotypes of African- 
American identity that reinforce his masculinity. By doing so, he constructs a distinct 
Korean-American identity that challenges mainstream US ideologies about how 
Asian-Americans should behave. It is probable that there are many more settings 
of this type, involving Asian-Americans or other groups, that simply have not 
been studied yet by sociolinguists. In terms of the model of language contact 
presented here, however, where sociocultural attitudes are privileged over the 
logistics of contact, it makes sense that members of minority ethnic groups might 
find it appropriate to borrow from other minority linguistic varieties. 


5 Tri-Ethnic Settings and Multiple Varieties 
in Contact 


Although the discussion above has focused on contact between two ethnic 
groups, in the multi-ethnic, multilingual communities that now exist in many places, 
linguistic contact may also take place among multiple groups. Often a variety asso- 
ciated with a minority ethnic group may come into contact simultaneously with 
varieties spoken by other minority groups in the area and by the dominant group, 
making the linguistic and social situations more complicated. A number of eth- 
nically complex settings of this type have been the sites of recent sociolinguistic 
research. 


294 Carmen Fought 


One of the most interesting tri-ethnic settings to be studied is North Carolina, 
home of the Lumbee Indians, as well as long-standing populations of African- 
Americans and European-Americans. In the region that has been the setting for 
most of the research, these three ethnic groups have been in contact for almost 
300 years (Wolfram & Schilling-Estes 1998). Of particular interest is the fact that 
the Lumbee lost their heritage language (or languages) long ago, a factor which 
has contributed to their being denied recognition and privileges given to other 
Native-American groups. The linguistic aspect of their construction of ethnic 
identity, then, must take place in English. 

Wolfram and his associates were interested in finding out how the Lumbee 
chose to signal their identity linguistically. Would they develop features in their 
English that were completely unique? Would they borrow features of AAE (the 
other minority variety in the area)? Would they assimilate completely to the local 
Southern variety and become indistinguishable from European-Americans in the 
region? The results show a pattern of influences that in many ways mirrors the 
complex social and ethnic history of the Lumbees themselves. 

Research on the variety of English used by the Lumbee (e.g. Wolfram & 
Schilling-Estes 1998; Wolfram & Dannenberg 1999) found certain features that index 
Lumbee identity specifically, and are not shared by other groups in the area. Among 
these are the regularization of was to were (He were home), and the use of perfec- 
tive be (I’m been to the store). The Lumbee variety also incorporates some features 
of AAE, but uses them somewhat differently; habitual be, for instance, is some- 
times used in non-habitual contexts, e.g. I hope it bes a girl. A combination of dis- 
tinct forms with forms that clearly stem from linguistic contact was also found 
in the Lumbee phonological system and lexicon. Lumbee speakers show evidence 
of phonetic features that are typical of the geographic region, and shared by all 
the local ethnic groups. In addition, though, they used some phonetic features 
not found in the other two ethnic groups, such as /ay/ raising and backing. The 
Lumbee are in some ways an ideal case study of ethnicity and language contact: 
they construct their ethnic identity linguistically, despite the loss of their ances- 
tral language, in part by borrowing elements from other ethnic groups in the area, 
and then using them in distinct ways. 

Hazen (2000) also looked at a setting of tri-ethnic contact in North Carolina. As 
in the studies cited earlier, he found that the Native American speakers in his 
study showed some use of features of AAE and some use of features associated 
with European-American varieties in the region. Hazen identified a crucial factor 
that interacted with ethnicity, however, in explaining the patterning of variables 
in his study: expanded identity versus local identity. Local-identity speakers were 
those who mostly maintained contacts within the local community; expanded- 
identity speakers were more oriented toward contacts and opportunities outside 
the community (e.g. attending or planning to attend college). The local-identity 
Native-American speakers in the community showed a stronger influence from 
the grammatical and phonological features characteristic of African-Americans. 
In fact, Hazen suggests that because of the convergence between Native- 
American and African-American varieties in this community, certain AAE 
features “may simply be marked as ‘young, non-European American’” (2000: 141). 
The expanded-identity Native American speakers were more likely to incorporate 
features of the European-American variety, which makes sense, given that this is 
the variety that is privileged in the wider community, e.g. in academic settings. 

Another ethnically complex situation is found in post-Apartheid South Africa. 
Of the places in the world that have been studied by sociolinguists, South Africa 
is one of the most politically, ethnically, and linguistically complex, and there is 
not space here to describe the situation fully.³ There are at least three types of 
ethnic groups, however, whose linguistic varieties stem from contact between 
multiple languages and groups. 

Among Black South Africans, a number of African languages are spoken; 
in addition, though, Gough (1996) reports that at least a partial language shift 
seems to have taken place, with the mother tongue declining in favor of English. 
He attributes this to the growing number of Black students attending completely 
English-medium schools. Historically, in Black South African communities, 
English was a functional language for most speakers of the older generation, used 
in formal public situations or as a lingua franca for those who did not share another 
language. The particular type of English spoken, then, is an emerging variety, Black 
South African English (BSAfE), which grew out of a setting where most speakers 
were second language learners. Its specific characteristics, as described by 
Gough, clearly show the influence of contact, as in, for example, the reduction of 
the English vowel system to either five or seven vowels, which is more typical 
of African languages. 

The linguistic situation of Indian South Africans is somewhat different, 
although it also shows the effects of contact. Although originally this ethnic group 
consisted of immigrants who brought other languages with them, the majority 
of these speakers are now monolingual in English, and have lost contact with 
the heritage languages of their ancestors (Kamwangamalu 2001; Mesthrie 2002b). 
As with BSAfE, Indian South African English (ISAE) shows the effects of having 
developed in a language contact situation, even when used by completely 
monolingual speakers, and includes patterns such as syntactic topicalization 
or retroflexion of consonants that are found in the languages of the Indian 
continent. 

The groups that during Apartheid were labeled as “coloured”⁴ also developed 
contact varieties as the primary mode of communication. McCormick (2002a) 
studied one such community, in an area known as “District Six,” which was the 
site of great ethnic and linguistic diversity in the history of its settlement. 
Because the heritage languages found in District Six were so numerous and diverse, 
the two colonizer languages of South Africa — Afrikaans and English — became 
the dominant languages of the community (McCormick 2002a). Both standard lan- 
guages, however, developed into mixed codes, and in particular, a non-standard 
variety of Afrikaans, known by the local term kombuistaal, became a significant 
variety associated with the community. Kombuistaal evolved out of contact with 
English and contains a large number of English lexical items. Code-switching 
between English and Afrikaans varieties is also common. Interestingly, these mixed 
codes, despite being non-standard, are valued positively by community members, 
and the use of “pure” English or Afrikaans is often openly criticized and cen- 
sored. Again, sociocultural considerations in the construction of ethnic identity 
are crucial to an understanding of the linguistic situation. The people in District 
Six were denied the rights and privileges accorded to whites by the apartheid 
government because of their (presumed) racially mixed ancestry. As McCormick 
notes, “In District Six, a concern for linguistic purity came to be seen as the province 
of those whites who had declared them ‘other’ and rejected and often humiliated 
them” (2002a: 222). It is not surprising, then, that a “mixed” code, symbolic of a 
“mixed” ethnic heritage, came to be so positively valued in this setting. 


6 Questions for Future Research 


While it may seem that a substantial amount of research has been done on the 
role of language contact in the construction of ethnic identity, much still remains 
to be explored. There are numerous parts of the world where the linguistic con- 
sequences of ethnic contact have yet to be explored. Even within the areas that 
have been well documented, such as the US or South Africa, some groups have 
been much less studied than others. Sociolinguists know very little about Asian- 
American groups in the US, for example, though the research we do have sug- 
gests that some very interesting patterns might be found among Asian-American 
speakers, as in the Chun (2001) study, cited earlier. 

Perhaps the most critical question, in a world that is increasingly diverse, 
is how multiracial speakers construct their ethnic and linguistic identities. 
Multiracial speakers represent within themselves a process of interethnic contact, 
yet we know very little about how this process is reflected linguistically. We know, 
from previous research, that these individuals may choose to identify with dif- 
ferent ethnic identities at different times in their lives, and may even choose to 
“pass” as a member of some other group than those in their biographical history 
(Bucholtz 1995). How are these complex social processes mirrored in language? 
The number of multiracial speakers in the US and other Western countries has 
been growing steadily. For sociolinguists, they hold a key to understanding the 
role of linguistic contact in the construction of ethnic identity. 


NOTES 


1 Interestingly, however, Labov’s later work shifts completely away from ethnicity. 
Labov 1994, for example, if one goes through the index, has no listing for “ethnicity,” 
and lists only three pages related to “ethnic groups, and linguistic change,” out of a 
total of 605 pages of text. 

2 Zentella (1997) also looked at Puerto-Rican Americans in New York City, and found 
that they commented on phenotype differences even in situations where she herself 
could not perceive any clear difference. 

3 See, for example, de Klerk (1996), Mesthrie (2002a), and McCormick (2002b) for a more 
complete discussion. 


4 This term may now have negative connotations, however. 


15 Contact and 
Sociolinguistic Typology 


PETER TRUDGILL 


Linguistic typology is that part of linguistics which attempts to describe and explain 
the diversity of structural features in the world’s languages (see Nichols, this 
volume). 


1 Genetic and Areal Factors 


It is generally assumed that the distribution of structural features over the 
world’s languages is not simply random. For example, it is not surprising that 
languages which are genetically related have certain characteristics in common, 
having inherited them from some ancestral language. Most of the Germanic lan- 
guages such as Swedish, Dutch, and Frisian have the “verb second” rule which 
is not found in unrelated languages: 


(1) No kjem ho 
‘Now comes she’ (Norwegian) 


It is also well known that there is an areal or Sprachbund component to the dis- 
tribution of structural features. For example, in phonology, velaric ingressive con- 
sonants (“clicks”) are found in the languages of southern Africa, regardless of genetic 
affiliation. Similarly, nearly all the languages of the South Asian subcontinent, of 
whatever language family, have retroflex consonants. And glottalic egressive con- 
sonants (ejectives) are found in a number of parts of the world, but in Europe 
they occur only in the languages of the Caucasus. A similar cross-family occur- 
rence is found for phonemic tone in languages in East and South-East Asia. 
Also in Europe, front rounded vowels of the type /y/, /ʏ/, /ø/, /œ/ are found 
in a remarkably contiguous area of the northwestern part of the continent which 
stretches from northern Norway to the Alps and from western Ukraine to the 
Atlantic. Languages which have at least one such vowel include Norwegian, 
Swedish, Danish, Finnish, Estonian, Dutch, Frisian, German (High and Low 
forms), Hungarian, Breton, French, and Occitan. On the northern fringes of the 
area Southern Sami has /ø/; and on the southern fringes, front rounded vowels 
are found in northwestern Italian varieties, certain Romansch dialects, and north- 
ern Basque varieties spoken in France, but not in other varieties of these languages. 
Specifically, the area excludes all the other Sami languages, Spanish, Catalan, 
all the other Basque dialects, the Baltic and Slavic languages, Rumanian, other 
Romansch dialects, and most of Italian.¹ 

Similar areal phenomena occur in the case of grammatical and semantic fea- 
tures, such as the postposed definite articles of the Balkan languages Albanian, 
Bulgarian, and Rumanian; and the absence of infinitives from these three languages 
as well as Greek (Joseph, 1983) and Romani. 

In the wider European context, this type of phenomenon has been treated 
by Haspelmath (2001) under the label, derived from Whorf (1956), of Standard 
Average European. The precise details of membership of this area are disputed (see 
Heine & Kuteva, 2006: 1-47) but it is agreed that typical European languages share 
a number of features which are not found in a majority of the rest of the world’s 
languages, including definite and indefinite articles; a possessive-perfect; com- 
parative marking of adjectives; and subject-verb inversion in questions. The core 
members of this linguistic area, with the highest number of Standard Average 
European features, include French, German, Dutch, Spanish, Portuguese, Italian, 
and Albanian. 


2 Sociolinguistic Typology 


In this chapter, however, we examine the possibility that the distribution of lin- 
guistic features over the languages of the world may be non-random in another 
sense. Sociolinguistic typology examines the possibility that different types of lan- 
guage or linguistic structure may be, or may tend to be, associated with different 
types of society or social structure (Trudgill, 2002; forthcoming b). Previous work 
in this field has indicated that major social factors that may have consequences 
for language structure are likely to include the following parameters: 


small versus large community size (e.g. Haudricourt 1961) 
dense versus loose social networks (Milroy & Milroy 1985) 
social stability versus instability (e.g. Dixon 1997) 
high versus low degree of shared information (e.g. Perkins 1992) 
degree of contact versus isolation² 


Obviously, it is the last of these, contact versus isolation, that is of interest to 
us in the context of the present volume, although most if not all of these para- 
meters are interdependent in various ways. 

This type of work, then, has little to do with genetic affiliation; but it is linked 
to areal distribution. For example, if we ask how the linguistic areas we have just 
been discussing came about, it is agreed that velaric ingressive consonants were 
originally a feature only of the indigenous Khoisan languages of southern Africa, 
and that they subsequently found their way into the Bantu languages which cur- 
rently also have them, such as Xhosa and Zulu, as a consequence of contact between 
the indigenous and Bantu languages as speakers of the latter languages penetrated 
into the south of the continent from their original homelands further north. 

Similarly, vowels of the front rounded type are also rare, so to find them in a 
single geographical area, and in so many European languages which are in many 
cases unrelated or not closely related, suggests that this is not a coincidence. The 
supposition has to be that this areal phenomenon is also at least in part the result 
of the (spatial) diffusion of this feature from one language to another. This is most 
obviously so in the case of the front-rounded vowels of Breton and the relevant 
varieties of Basque, whose speakers are all bilingual in French, and South Sami, 
whose speakers are generally bilingual in Swedish or Norwegian — and of course 
diffusion from one language to another cannot occur without contact between 
speakers of those languages. 

And we can suppose that Standard Average European might have, say, 
“have”-perfects because this feature started life in one of the languages and 
subsequently spread to the others: according to Heine and Kuteva (2006: 154), the 
European possessive perfect of the type I have done it “emerged in Late Latin as 
a distinct periphrastic active aspect category.” 


3 Contact and Complexification 


Clearly, then, contact can have an influence on the languages involved. In the 
context of specifically sociolinguistic typology, however, the important question 
concerns the extent to which the contact that speakers of a particular language 
variety have, or have had, with speakers of other varieties can have an influence 
on the nature of that language variety. In this paper I attempt to move toward an 
answer to this question in terms of one particular parameter, namely the con- 
sequences that contact may have for the relative linguistic complexity of language 
varieties.³ 

In fact, a study of the relevant literature shows that there is considerable 
evidence that language contact can lead to an increase in linguistic complexity. 
This appears to occur as a result of the addition of features transferred from one 
language to another: for example, Dahl (2004: 127) refers to “contact-induced 
change” and the spread of grammatical elements from one language to another. 
This is then additive change in which new features derived from neighboring lan- 
guages do not replace existing features but are acquired in addition to them. A 
sociolinguistic-typological consequence of language contact is therefore that lan- 
guages which have experienced relatively high degrees of contact are more likely 
to demonstrate relatively higher degrees of this type of additive complexity than 
languages which have not. 

This supposition is supported by the work of Nichols (1992). Nichols examined, 
amongst a number of other phenomena, morphological complexity in 174 different 
languages from all parts of the world and from a very wide range of language 
families. 

Morphological complexity for Nichols has to do with morphological marking 
of syntactic relations: “any form of inflection, affixation, cliticisation, or other overt 
morphological variation that signals some relevant relation, function, or mean- 
ing” (1992: p. 48). Marking may consist of: 


indexing — e.g. marking of person and number of the subject on the verb; 

coding — e.g. cases on nouns marking functions such as subject; 

registering — i.e. marking the presence of another word without indexing its 
features, e.g. the definite conjugation of Hungarian verbs, which registers the 
presence of an object. 


Nichols outlines a working definition of morphological complexity which relies 
on a count of the degree to which a language has dependent marking, head marking, 
and detachment (“free” marking), as opposed to zero marking. (Head marking might 
be, for example, the marking of a possessive construction on the item possessed; 
dependent marking on the possessor; and detachment the use of independent forms 
such as clitics or pronouns. All of these would give a count of 1, as opposed to 
zero marking, which would involve simple juxtaposition of the possessed and the 
possessor, which would obviously score 0.) Given these three marking possibilities, 
and the nine categories⁴ Nichols examines, scores in principle range from 0 
to 27 (Nichols 1992: 64), but in Nichols’ large sample of languages, actual scores 
go only from 2 to 15. 
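
The arithmetic behind this range can be sketched in a few lines of Python. This is only an illustration of the counting principle described above, not Nichols' own procedure: the category labels and the sample language are hypothetical placeholders, and the only figures taken from the text are the three marking types, the nine categories, and the resulting ceiling of 27.

```python
# Illustrative sketch of a Nichols-style complexity count.
# Assumption: each of nine categories scores 1 for every overt marking
# type it shows (head, dependent, detached); zero marking adds nothing.

MARKING_TYPES = ("head", "dependent", "detached")
N_CATEGORIES = 9  # number given in the text; the labels used below are placeholders
MAX_SCORE = len(MARKING_TYPES) * N_CATEGORIES  # 3 * 9 = 27


def complexity_score(language):
    """Sum one point per overt marking type attested in each category."""
    total = 0
    for category, markings in language.items():
        unknown = set(markings) - set(MARKING_TYPES)
        if unknown:
            raise ValueError(f"unknown marking type(s) in {category}: {unknown}")
        total += len(set(markings))
    return total


# Hypothetical language, invented purely for illustration.
sample = {f"category_{i}": set() for i in range(1, N_CATEGORIES + 1)}
sample["category_1"] = {"head"}
sample["category_2"] = {"head", "dependent"}
sample["category_3"] = {"detached"}

print(complexity_score(sample), "out of a possible", MAX_SCORE)
```

A language with zero marking in every category scores 0, and one with all three marking types in all nine categories would reach the theoretical maximum of 27, which no language in the sample approaches.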

It is interesting to note that the languages in the sample with the highest degrees 
of complexity, so measured (14-15 in Nichols’ scoring), are the (unrelated) 
ancient languages of the Near East, Sumerian and Akkadian; the Northern 
Australian languages, Mangarayi and Djingili; Basque; and the North American 
Utian language Southern Sierra Miwok. There are no languages from Nichols’ 
Oceanic area category among the top 27 most complex languages. 

The least complex languages, so measured (2-3), are the Khoisan language !Kung; 
the North American isolates Chitimacha (a now extinct language of Louisiana) and 
Zuni (New Mexico); the North Asian isolate Gilyak/Nivkh (Sakhalin and Amur, 
Russian Far East); Central American Mixtec; the (possibly) Nilo-Saharan Songhai 
(spoken in Mali); and Southeast Asian Miao/Hmong and Mandarin Chinese. There 
are no languages from Nichols’ Australian or Ancient Near Eastern categories among 
the top 26 least complex languages. 

This measure of complexity according to Nichols “correlates with overall 
morphological complexity and hence can be used as an index of something real.” 
But Nichols concedes that it “overlooks a good deal of the actual morphological 
complexity of languages, both in omitting categories commonly signalled by 
morphology (e.g., tense and aspects on verbs) and in considering where and 
whether something is marked but not how (e.g., by inflection, agglutination, 
cliticisation, or incorporation)” (p. 64). This latter “how” point, which seems to 
me to be of more consequence than Nichols perhaps accords it, will be discussed 
further below. 

Nichols also examines the characteristics of languages which occur in subcon- 
tinental spread zones as opposed to those which occur in residual zones. Spread zones 
show little genetic diversity, low structural diversity, “shallow” language fam- 
ilies, a history of rapid spreading of languages; and typically have an innovating 
center and a conservative periphery, and lingua franca use of spreading languages, 
with no long-term increase in diversity. Such zones include western Europe 
(Indo-European), central Australia (Pama-Nyungan), interior North America 
(Algonquian and Siouan), and central Oceania (Austronesian). 

Residual zones have high genetic diversity, high structural diversity, “deep” 
language families, no appreciable spread of languages or families, no obvious inno- 
vation center, a long-term increase in diversity, and no widespread lingua franca. 
Examples are the Caucasus, the Pacific North West of the USA and Canada, the 
Balkans, and New Guinea. Residual zones are often to be found on the periphery 
of spread zones. 

The crucial point for our discussion of additive complexity is the fact that, as 
Nichols writes (1992: 192), “spread zones show lower complexity relative to their 
continents” while residual zones show higher complexity. But crucially, “inde- 
pendently of whether they are in residual or spread zones, however, almost all 
these high-complexity languages are in areas of considerable linguistic diversity 
and contact. It can be concluded that contact among languages fosters complex- 
ity, or, put differently, diversity among neighbouring languages fosters complexity 
in each of the languages.” 

An excellent example of this kind of phenomenon which supports Nichols’ claim 
is provided by the work of Aikhenvald in Amazonia (Dixon & Aikhenvald 1999; 
Aikhenvald 1996; 1999; 2002; 2003). In the Vaupes river basin area of northwest 
Amazonia on the borders of Brazil and Colombia, “obligatory multilingualism” 
is the norm, “dictated by the principles of linguistic exogamy (one has to marry 
someone who speaks a different language)” (Aikhenvald, 2003: 1). Tariana is the 
only member of the Arawak language family spoken in the area, where it has 
been in long-term contact with languages of the East Tucano branch of the 
Tucano language family. As a result, there are many signs of Tucano influence 
on Tariana, some of which, crucially, is additive. For example, the consequences 
of this contact for complexification are apparent in Aikhenvald’s information that 
“Tariana is one of the very few Arawak languages with case marking for core 
syntactic functions” and that “it developed the case marking under Tucano 
influence” (p. 3; see also p. 139). 

Similarly, considerable additive influence from the Tucano languages can be 
seen in the Tariana five-way system of evidentials. As in many other languages, 
there is a grammatical requirement in Tariana to state explicitly the source of one’s 
information. The evidential system intersects with the 3 tenses, with 2 lacunae, 
to give 13 different enclitic markers. In the remote past tense the five evidential 
verb suffixes are: 


Visual:                 -na 
Nonvisual:              -mhana 
Inferred (a) generic:   -sina 
         (b) specific:  -nhina 
Reported:               -pidana 


The visual evidentials refer to events which have been seen, as in The dog bit 
him [we have seen it]; nonvisual to events heard or otherwise sensed nonvisually, 
as in The dog bit him [we have heard the noise]; the two inferred categories relate 
respectively to “information obtained by reasoning or common sense through 
observing evidence of an event or state without directly experiencing it” and 
“information obtained through observing direct evidence of an event or state,” 
as in The dog bit him [he has a scar]; and reported refers to second-hand or 
third-hand information, as in The dog bit him [someone told me].⁵ Importantly 
for our purposes, Aikhenvald states that while the reported evidential was 
probably inherited from Proto-Arawak, the other evidentials are found in 
Tariana because of contact — “as a result of areal diffusion from Tucano languages” 
(p. 293). 
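
The figure of 13 markers follows from crossing the five evidentials with the three tenses and subtracting the two gaps. The short Python sketch below makes the count explicit. The remote past forms are those listed above; the labels for the other two tenses and the position of the two lacunae are not given in the text, so the values used here are placeholders chosen only to make the arithmetic concrete.

```python
from itertools import product

# Remote past forms as listed in the text.
REMOTE_PAST = {
    "visual": "-na",
    "nonvisual": "-mhana",
    "inferred generic": "-sina",
    "inferred specific": "-nhina",
    "reported": "-pidana",
}

EVIDENTIALS = list(REMOTE_PAST)
# Only "remote past" is named in the text; the other labels are placeholders.
TENSES = ["tense_A", "tense_B", "remote past"]

# The text says two combinations are missing but not which ones.
# These two cells are an arbitrary assumption for illustration.
HYPOTHETICAL_LACUNAE = {
    ("inferred generic", "tense_A"),
    ("inferred specific", "tense_A"),
}

cells = [c for c in product(EVIDENTIALS, TENSES) if c not in HYPOTHETICAL_LACUNAE]
print(len(cells))  # 5 * 3 - 2
```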

Very many other examples could be given. For example, Foley (1986) discusses 
the relationship between the neighboring but genetically unrelated Papuan lan- 
guages Yimas, Alamblak, and Enga. He suggests, amongst a number of other 
candidates for contact-induced morphological category-addition, that switch- 
reference systems and morphemes may have spread from one language to 
another. Yimas does not have a switch-reference system, but the other two lan- 
guages do. Foley suggests that because “different-actor dependent verb forms are 
not very common in the Sepik area languages, diffusion of this feature from Enga 
into Alamblak is possible” (1986: 267). Tariana, too, seems to have acquired 
switch reference as a result of contact: “since switch reference is not found in any 
other Arawak languages of the Upper Negro River area and is widespread in east 
Tucano languages, it is likely to have entered Tariana as the result of areal dif- 
fusion” (Aikhenvald, 2003: 515). 

We also saw above that languages can acquire additional phonological features 
from other languages, such as clicks and front rounded vowels. Many other cases 
could be given: Rivierre (1994) supplies a range of phonological examples of what 
he refers to as contact-induced phonological complexification in the Austronesian 
languages of New Caledonia, such as the acquisition of voiceless aspirated con- 
sonants in Koné-area languages from Polynesian and other sources. 

Language contact can thus lead to borrowing of an additive type, as well as 
of a replacive type where a new feature replaces an old one. And to the extent 
that change is additive, we can suppose that languages spoken in communities 
with high degrees of contact with other communities will tend to have more such 
additive features than those which do not. The sociolinguistic-typological con- 
sequence, as we supposed above, is that high-contact societies are more likely 
to have languages with higher levels of additive complexity than more isolated, 
lower-contact societies. 

The notion of linguistic complexity, however, is itself a complex one (see Dahl 
2004; Sampson et al. 2009), and one which cannot simply be couched in terms of 
numbers of features. For example, it is obvious that those Bantu languages of south- 
ern Africa which possess clicks have acquired them as the result of contact with 
Khoisan; and the addition of a whole new manner of articulation and thus a whole 
new series of consonantal articulations to the Bantu languages in question can be 
regarded as representing an additional degree of contact-induced linguistic com- 
plexity. But it is also relevant that clicks are articulatorily complex, and very rare 
indeed in the world’s languages; and phonological systems with them can be said 
to have a degree of complexity which systems without them do not: consonantal 
systems with clicks may be inherently more complex than systems without (note 
that this is true of !Kung, which in terms of Nichols’ measurements came out as 
having low morphological complexity). Front rounded vowels are also rare, as 
we have already noted, occurring in only about 9 percent of the world’s languages, 
according to the data presented in Maddieson (1984: 248-51). Vowels of this type, 
which are perceptually complex, are sometimes regarded by phonologists, like 
clicks and ejectives, as marked articulation types (though see Haspelmath 1996), 
being not only rare but mastered late by children during first language acquisition, 
and highly susceptible to loss during linguistic change.⁶ 

Even replacive borrowing can illustrate this same point, as in the case of the 
geographical diffusion across language frontiers in Europe of uvular /r/ as a replace- 
ment for apical /r/, extending from France as far as Norway, where the spread 
is still ongoing (see Trudgill 1973; Chambers & Trudgill 1998). Although the change 
from apical to uvular /r/ is simply replacive, it is also true that uvular articula- 
tion types are much rarer than alveolar. Of the 317 languages presented in 
Maddieson (1984: 32), 99.7 percent have dental/alveolar stops, while only 14.8 
percent have uvular stops. And indeed only approximately 17 percent of the 
languages described in Maddieson (1984) have any uvular articulation-types of 
any kind. 
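
To make these proportions concrete, the percentages can be converted back into approximate language counts. The Python sketch below assumes that all three percentages refer to the same sample of 317 languages, which the text states for the first two figures and only implies for the third; the rounded counts are therefore approximations, not figures quoted from Maddieson.

```python
SAMPLE_SIZE = 317  # languages in the sample cited in the text


def approx_count(percent, total=SAMPLE_SIZE):
    """Convert a percentage of the sample into a rounded number of languages."""
    return round(total * percent / 100)


dental_alveolar_stops = approx_count(99.7)
uvular_stops = approx_count(14.8)
any_uvular = approx_count(17)  # "approximately 17 percent" in the text

print(dental_alveolar_stops, uvular_stops, any_uvular)
```

On these assumptions, all but one of the languages in the sample have dental or alveolar stops, whereas fewer than fifty have uvular stops.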

The suggestion here of course is not that contact produces inherently complex 
features, but that because such features are marked, languages are relatively unlikely 
to produce them internally, and so where they do occur, contact is relatively likely 
to have been involved. 

For example, Mithun (1999: 317) describes a small Sprachbund where members 
of four different language families are in contact: Lake Miwok (Utian)⁷; Wintuan; 
Pomoan; and Wappo. Wappo is an isolate, but the crucial point is that the Clear 
Lake members of the other three families differ significantly from other members 
of the same family spoken in different areas. Lake Miwok, for example, “differs 
strikingly in its phonological inventory from its relatives.” Miwok languages 
generally have only one series of stops, but Lake Miwok is considerably more 
complex, having added aspirated, voiced and ejective stop series, as well as four 
additional affricates, plus /r/ and /1/. Importantly, 30 percent of the lexical items 
having these articulations can be shown to be loans from neighboring languages. 

We can thus take seriously the sociolinguistic-typological hypothesis that 
languages spoken in high-contact societies are relatively more likely to have 
relatively more complexity, not just in terms of having more features such as grammatical 
categories and articulation types, but also in terms of possessing features 
which are inherently more complex. 


4 Contact and Simplification 


Linguistic contact, then, can clearly bring about complexification. But the paradox 
here is that contact can also, and just as clearly, lead to simplification. This is 
very well known, and is a point which has been made by many writers. Pidgins, 
creoles, and creoloids (Trudgill 1996) are all widely agreed to owe their relative 
structural simplicity to language contact; and koines to dialect contact (Trudgill 
1986). 

And contact-induced simplification has often been noted in the historical 
linguistics literature. For example, Milroy (1992: 203) writes of the trend toward 
simplification in the transition from Old English to Middle English that “it seems 
clear that such a sweeping change is at least to some extent associated with 
language contact” (emphasis in original). The Norwegian linguist Hans Vogt 
(1948: 39), quoted in Harris and Campbell (1995: 133), says that “on observe 
souvent qu’une langue ... perd des distinctions formelles, dans des circonstances 
qui rendent l’hypothèse d’influence étrangère assez naturelle.” Sankoff (2002: 
657) notes that, according to Bokamba (1993), multilingual language contact 
situations “may result in morphological simplification” where a language is used 
as a lingua franca. 

More recently the link between contact and simplification has been very ably 
demonstrated quantitatively by Kusters (2003). Kusters examines the history 
of degrees of complexity and simplicity in verbal inflectional morphology in 
Quechua, Swahili, Arabic, and the Scandinavian languages. He measures sim- 
plification in varieties of Arabic, for example, in terms of the degree to which dif- 
ferent varieties have undergone developments such as loss of dual number and decrease 
in allomorphy. His highly detailed quantitative analyses lead him to conclude that 
“the level of [linguistic] complexity is related to the type of speech community” 
(2003: 359) in that language varieties with a history of higher contact also tend to 
demonstrate higher degrees of simplification. 

Kusters also discusses the much greater simplification, notably deflexion, which 
has been undergone by the continental Scandinavian languages as opposed to the 
insular Scandinavian languages. He ascribes this, as many others have done before 
him, to heavy and prolonged contact of the continental languages with the Low 
German of the Hanseatic League. This loss of complexity is, for instance, noted 
by Askedal, who observes that Icelandic and Faroese have “retained most or at 
least a fair amount of the morphological characteristics of Old Norse,” whereas 
the continental languages “have gone through a process of morphological change 
(simplification)” (2005: 1872). 

Norde (2001) illustrates this type of change in terms of nominal deflexion in 
Swedish. She talks of the “devastating effects” (2001: 242) of loss of inflection, and 
contrasts the morphology of the Old Swedish masculine noun fisker ‘fish’ with its
Modern Swedish equivalent fisk (cf. Wessén 1958: 85, 105): 


OLD SWEDISH

sg nom indef   fisker     sg nom def   fiskerin
sg gen indef   fisks      sg gen def   fisksins
sg dat indef   fisk(i)    sg dat def   fiskinum
sg acc indef   fisk       sg acc def   fiskin
pl nom indef   fiskar     pl nom def   fiskanir
pl gen indef   fiska      pl gen def   fiskanna
pl dat indef   fiskom     pl dat def   fiskomin
pl acc indef   fiska      pl acc def   fiskana

MODERN SWEDISH

sg indef       fisk       sg def       fisken
pl indef       fiskar     pl def       fiskarna


In Old Swedish, the noun had 15 different forms, while in Modern Swedish it 
has only 4. 

We should not be surprised that simplification seems especially likely to attack 
inflection. Dahl (2004: 111) tells us that “in the class of mature linguistic phenomena, 
we find that the most obvious one is inflectional morphology,” where mature phe- 
nomena represent “evolutionary complexity” (2004: 105) and are of a form which 
presupposes “rather long evolutionary chains” (p. 112). 

In Trudgill (1996), and following Mühlhäusler’s pioneering work (1977) and
earlier important work such as that of Ferguson (1959; 1971), I argued that there 
are three crucial, linked, components to the simplification process: 


1 the regularization of irregularities. In regularization, obviously, irregularity 
diminishes, so that, for example, irregular verbs and irregular plurals become 
regular, as in the development in English of helped rather than holp as the 
preterite of help; and the replacement of kine by cows as the plural of cow. The 
reduction in allomorphy in Arabic investigated by Kusters also comes into this 
category. 

2 an increase in lexical and morphological transparency: for example, forms such
as twice and seldom are less transparent than two times and not often, and any 
(partial or complete) replacement of the former by the latter would represent 
simplification. Of course forms such as cows are also more transparent or 
analytic, and iconic, than forms like kine. And Kusters’ Arabic allomorphy reduc- 
tion once again comes into this category. 


These two factors are often linked. A good example of this is provided by the 
present tense forms of strong verbs in modern Norwegian. In Old Norse, these 
forms were monosyllabic and were derived from the base form by i-umlaut. This 
type of system survives in most Norwegian dialects today, where irregular forms
such as komme-kjem ‘to come-come/comes’, sove-søv ‘sleep’, and grave-grev ‘dig’
are usual. However, in the populous area around Oslo and along the well- 
trafficked southern coast, regularization has taken place, as in Swedish and 
Danish. In these areas simpler, more regular present tense forms such as kommer, 
sover, graver are found instead. Notice, too, that they are also more transparent, 
in the sense that they are easily analyzable into a verb stem and the present tense 
morpheme -er. The result is a verb morphology which is much more regular, and 
where as a result of having identical stems in all forms, transparency is greatly 
increased. 


3 the loss of redundancy. All languages contain redundancy, which seems to 
be necessary for successful communication, especially in less than perfect, 
i.e. normal, circumstances. But it is probable that some languages have more 
redundancy than others. Redundancy, and therefore loss of redundancy, 
takes two major forms. The first occurs in the form of repetition of information, 
or syntagmatic redundancy (Trudgill 1978), as for example in grammatical 
agreement, where there is more than one signal that, say, a noun phrase is 
feminine; or in obligatory tense marking, as for example when all verbs in a 
past-tense narrative are marked for past tense. Here, reduction in redundancy 
will take the form of reduction of the number of repetitions, as in the loss of 
agreement. 


For example, adjectives do not receive a plural ending when used predicatively 
in the Norwegian dialect of Bergen, e.g. vi er trøtt ‘we are tired’ as opposed to
plural trøtte in other dialects. Jahr (1998) points out that this system of adjective
inflection is simpler than in most other Norwegian dialects that do have plural 
marking on predicative adjectives; and since simplification of the grammar is one 
of the possible outcomes of contact, he argues that it is likely that this change in 
the agreement system, involving loss of redundancy, is due to intensive contact 
with Low German, which Bergen experienced more than any other area of 
Norway. 

The second type of redundancy loss involves the loss of morphological categories. 
Sometimes loss of the morphological expression of grammatical categories is compensated
for by the use of more analytical structures, as in the Modern English use
of prepositions instead of the dative case of Old English. (Analytical structures
are also obviously more transparent than synthetic ones.) And sometimes loss is 
just loss, as in the loss of dual number in Arabic referred to by Kusters. 

A good example of this latter type is the loss of grammatical gender. 
Grammatical gender disappeared from English without, apparently, this loss 
having had any structural consequences. 

In the Scandinavian languages, according to Haugen (1976: 288), the three gram- 
matical genders masculine, feminine, and neuter are preserved “in the overwhelming 
majority of Scandinavian dialects down to the present.” However, a number of 
varieties have reduced the number of genders to two. These are standard Danish 
and Swedish, and a number of Danish and Swedish dialects, as well as Bergen
Norwegian. 

A number of writers have also ascribed this simplification to language and/or 
dialect contact. Pedersen (1999) suggests that the loss of the distinction between 
masculine and feminine in Copenhagen, and hence in Standard Danish, is due to 
contact between the dialects of Zealand and (the now Swedish) Scania, and, at 
the time of the supremacy of the Hanseatic League, Low German. And Jahr (1995; 
2001) arrives at the same conclusion concerning the loss of the masculine/ 
feminine distinction in Bergen — but nowhere else in Norway apart from areas 
of Sami-Norwegian language contact: it is the result of “heavy influence of lan- 
guage contact between Norwegian and Low German” (2001: 100). Indeed, he 
ascribes the entire typological split between the continental and the still highly 
fusional insular Scandinavian languages — Askedal (2005: 1872) says “the modern 
languages represent two distinct typological groups” — to the absence of Low 
German contact with Faroese and, especially, Icelandic; and Haugen (1976: 313) 
does the same. 

I do not treat phonological simplification here, but the above discussions 
would suggest that this will include processes such as reduction in size of 
phoneme inventories, including especially loss of marked articulation types, and 
loss of allophony, as well as other processes such as loss of tone and reduction 
in phonotactic possibilities: pidgin languages typically have CVCV structure and 
relatively small phoneme inventories, and lack contrastive tone. 


5 The Conundrum 


It would therefore, it seems, be a mistake to suggest that language contact causes 
either simplification or complexification; it clearly produces both. This has caused 
considerable bewilderment in the literature, including in my own writing. For 
example, Heine and Kuteva (2005: 258) oppose my claims (Trudgill 1983) about 
contact leading to simplification, as in pidginization, and they write that the con- 
tact situations they have investigated “tend to lead not to the reduction and loss 
of existing grammatical categories, but rather to diversification and to the creation 
of new grammatical categories,” i.e. to additive change as studied by Nichols. And 
they write of pidgins and creoles that “it may seem surprising that pidgins and
creoles did not figure more prominently in the present work.” Equally, in 
Trudgill (1983) I do not acknowledge that contact can lead to complexification. 

Similarly, in Trudgill (1983) I discussed the development of double-definite mark- 
ing in Norwegian and Swedish, as in Norwegian: 


(2) Den store bok-a 
‘The big book-the’ 


I presented this as an instance of the sort of development that is typical of the redundancy
which does not develop in high-contact languages like creoles (which is correct),
while Dahl (2004: 282) maintains that it is precisely “a prime example of a
contact phenomenon in that it shows up in the intersection of two spread areas, 
one of the preposed and one of the suffixed article” (which is also obviously 
correct). 

Thomason, however, is not bewildered. She simply notes that a common
proposal in the literature is that “contact-induced change leads to simplification,
not complication” (2001: 64) but that “the opposite claim has also been made — 
namely, that interference always complicates the system” (p. 65). She points 
out that Givón (1979) claims that contact-induced change simplifies grammar,
while Bailey (1977) claims that “interference always complicates the grammar” 
(Thomason 2001: 96). And then she says that “all the examples that support the 
claim that interference leads to simplification are of course counterexamples to 
the opposite claim” (2001: 65). Harris and Campbell (1995: 133), too, note the “struc- 
tural simplification” claim, but also say that “there are clear counterexamples.” 

The basic sociolinguistic-typological conundrum is therefore that we have to 
determine which language-contact-linked societal factors lead to complexification 
and which to simplification. 

This is in fact not too difficult to do. The key lies in the nature of language learn- 
ing and acquisition. 


6 The “Critical Threshold” 


Anttila says that “language-learning situations, in general, are responsible for vari- 
ous simplifications” (1989: 189); but it turns out that it is very much a matter of 
who does the learning, and under what circumstances. In Trudgill (1983: 106) I argued 
that “it is legitimate to suggest that some languages actually are easier for adults 
to learn, in an absolute sense, than others” and that these “easier” languages are 
more analytical languages “which have experienced more contact.” The clue to 
our conundrum lies in the phrase: “easier for adults to learn.” The point is that 
whenever adults and post-adolescents learn a new language (Trudgill 1993), 
pidginization can be said to occur; and pidginization includes as a crucial component
the process of simplification.⁸ This in turn (although other factors such
as motivation may be involved) is due to the difficulty human adults face in
learning new languages perfectly. 

This inability is due to the fact that adult and adolescent humans are speakers 
who have passed the critical threshold (Lenneberg 1967) for language acquisition. 
The critical threshold is the most fundamental instrumental mechanism involved 
in pidginization, and it has to be considered to play an important part in our under- 
standing of language contact phenomena. Lenneberg’s term, whether or not one 
accepts that “threshold” is the most appropriate metaphor for what is most likely 
a gradual tailing off, refers to the well-known fact, obvious to anyone who has 
been alive and present in normal human societies for a couple of decades or so, 
that while small children learn languages perfectly, adults do not. Dahl (2004: 101) 
writes of “children’s ability to learn large amounts of low-level facts about 
language.” He also claims (p. 294) that “human children indeed seem to have an
advantage compared to ... adult members of their own species.” And, as Trask
(1999: 63) has it, “young children learn perfectly any language to which they are 
adequately exposed . . . [while] few adults can perform the same feat.” Trask then 
goes on to say (1999: 64) that “strong support for Lenneberg’s hypothesis comes 
from the observation of feral children who ... have been denied normal access
to language in early life” and who fail to learn language after being first exposed 
to human contact as teenagers. And there is also excellent evidence for this from 
the deaf community, with age of exposure to Sign Language being closely related 
to success in its acquisition (Hyltenstam, 1992). 

The view from sociolinguistics (e.g. Labov 1972) is that children acquire new 
dialects and languages more or less perfectly up to the age of about 8, and that 
there is no or very little chance of them learning a language variety perfectly 
after the age of about 14. What happens between 8 and 14 will depend very much 
on the circumstances and on the individual. Although it must be the case that 
sociological and sociopsychological factors are also partly responsible for the 
relatively poor language-learning abilities of adults in natural acquisition situ- 
ations, developmental factors play an obvious and vital role. 

Some linguists are a little cautious about accepting this common-sense position: 
Anderson and Lightfoot are simply willing to say (2002: 209) that “whatever 
we learn after the period of normal first-language acquisition, we learn in a dif- 
ferent way.” 

But, beginning in the 1970s, there were also, surprisingly, more strongly dis- 
senting voices, both from Second Language Acquisition studies and formal 
linguistics. (A recent example is Hale (2007: 44), who says that “I do not believe
in what is sometimes called the Critical Age Hypothesis.”) Swan (2007) has 
argued, in a somewhat different context, that this kind of reluctance to accept 
the obvious was explicable in terms of the linguistic-ideological position taken 
by some linguists who were “concerned to show, in accordance with the new 
orthodoxy of the time, that all language development was driven by unconscious 
mechanisms whose operation was similar, if not identical, for both L1 and L2” 
and that any position which was not compatible with this view “needed to be 
discredited.” 

The role of the critical threshold, or at least of the age of the learner, in
producing simplification is supported by Kusters (2003: 21), who in making 
the point that inflectional or fusional morphology is implicated in complexity, 
crucially specifies that it is “outsider complexity” he is referring to. Here an 
“outsider” is “a second language learner, who is not acquainted with the speech 
community of which s/he is learning the language, and who wants to use the 
language to transmit meaningful messages” (2003: 6, my italics). And of course 
it is the fact that we are dealing with post-threshold adult second-language 
outsider-learners here that is the crucial one. Dahl (2004: 294) uses the term 
“L2-difficult.” 

At this point we may return once again to the approach of Nichols. As we 
have seen, Nichols argues that her morphological complexity index “correlates 
with overall morphological complexity and hence can be used as an index of
something real.” But it will be remembered that she concedes that it “overlooks 
a good deal of the actual morphological complexity of languages . . . in consider- 
ing where and whether something is marked but not how (e.g., by inflection, agglu- 
tination, cliticisation, or incorporation)” (1992: 64). 

It is obviously the case that absence of marking, scoring 0 in Nichols’ system, 
represents maximum simplicity, as with the loss of English gender. However, 
marking by inflection and marking by agglutination or detached marking (e.g.
pronouns, clitics), although both score 1 according to Nichols, actually differ
in terms of their complexity: inflexional marking is the more complex because it
is less transparent. 

Indeed, Kusters cites Clahsen and Muysken (1996), and Meisel (1997), as
demonstrating that L2 learners have serious problems with inflection, and that 
here is a point where “adults have clearly more learning difficulties than L1 
learners” (p. 48). Eubank (1996) also argues that the capacity to learn inflection 
in language is innate, but that for adults and adolescents this capacity is no 
longer accessible. And Dahl writes (remembering that he describes inflection
as the archetypical exemplar of a “mature” linguistic phenomenon) that “there
is a significant overlap between the mature features listed [above] and those 
linguistic features that are most recalcitrant in second language learning” (Dahl, 
2004: 286). 

Kusters then goes on to adduce (2003: 21) three principles of inflection in con- 
nection with verbal morphological complexity which are relevant to this discus- 
sion, and which help to elucidate why inflection is so “recalcitrant.” The first 
principle is the Economy Principle, which states that “as few semantic categories 
or category combinations as possible should be expressed morphologically.” The 
Transparency Principle “demands that the relation between form and meaning is 
as transparent as possible.” The highest level of transparency or analyticity is when 
“every single meaning is expressed in a separate form.” And the Isomorphy 
Principle states that the affix order should be “isomorphic to the order of mean- 
ing elements.” 

It is clear that agglutinating languages are less complex in terms of “outsider 
complexity” for post-threshold learners in that, at least, these languages adhere 
to the transparency principle much more closely than fusional languages, and are 
more analytic. To take a textbook example, Steinbergs (1996: 381) says that an agglu- 
tinating language “has words which can contain several morphemes, but the words 
are easily divided into their component parts . . . Each affix is clearly identifiable 
and typically represents a single grammatical category or meaning.” She then gives 
Turkish examples such as koj-ler-in ‘village-plural-genitive’. However, in a 
fusional language affixes “often mark several categories simultaneously” — they 
have “several simultaneous functions.” She cites examples from Russian such 
as ruk-u ‘hand-feminine + singular + direct object’. According to Anderson, 
inflectional languages “have internally complex words which cannot easily be 
segmented into an exhaustive and non-overlapping string of formatives” 
(Anderson, 1985: 8). And as Andersson asserts: 

in absolute terms one could say that analytic languages are easier than synthetic languages,
and there are two arguments for this claim. Firstly, children always learn a
more analytic version of their native language; inflectional and derivative suffixes
are learned later on. Secondly, pidgin languages from around the world are typically
analytic. (Andersson 2005: 46)


7 Two Types of Contact 


Now the picture becomes clearer. Simplification in language contact does not result 
from nonnative language learning as such, but from post-critical threshold non- 
native language learning. As Labov (2007: 382) says of contact phenomena he has 
studied: they “share the common marker of adult language learning: the loss of 
linguistic configurations that are reliably transmitted only by the child language 
learner.” Simplification will occur in sociolinguistic contact situations only to 
the extent that adult second-language learning occurs, and not only occurs but 
dominates. 

I have argued (1983) that we have become so familiar with this type of sim- 
plification in linguistic change — in Germanic, Romance, Semitic — that we may 
have been tempted to regard it as normal — as a diachronic universal. However, 
it is probable that 


widespread adult-only language contact is mainly a post-neolithic and indeed
a mainly modern phenomenon associated with the last 2,000 years, and if the 
development of large, fluid communities is also a post-neolithic and indeed mainly 
modern phenomenon, then according to this thesis the dominant standard modern 
languages in the world today are likely to be seriously atypical of how languages 
have been for nearly all of human history. (Trudgill 2000) 


Nichols (2007: 176) agrees that contact “may well have been rare in prehistory 
though it is responsible for much reduction in morphology in Europe over the 
last two millennia.” 

If we define linguistic complexity as being related to⁹ the difficulty of acquisition
of a language, or a subsystem of a language, for adolescent and adult learners
(Trudgill 2009), then it is clear why simplification, at its most extreme in the devel- 
opment of pidgins, takes the form it does. Post-threshold learners have difficulty 
in coping with irregularity and opacity; redundancy adds to the burden for 
learner-speakers. Highly irregular and nontransparent features are harder to 
learn and remember: arbitrariness in grammar produces material which has to 
be learned without any generalization possible. And high redundancy means 
that there is more to learn (see Bakker 2003). Indeed it does seem reasonable to 
characterize complexity in this way. As Andersson says: 


The terms simple and complex rather refer to structural aspects of the language: a 
simple language has fewer rules, paradigms and grammatical forms than a complex 
one. Furthermore, a simple language (in this structural sense) is easier to learn (in 
terms of time and effort) than a complex language. (Andersson, 2005: 40) 

Speech communities which have frequent contacts with other societies that
involve adult dialect or language acquisition are therefore relatively more likely
to produce languages and dialects which demonstrate simplification,¹⁰ the most
extreme, though least usual cases typologically, being pidgins and creoles. Changes 
such as a move from synthetic to analytic structure, reduction in morphological 
categories and grammatical agreement and other repetitions, increase in regular- 
ity, and increase in transparency, make for greater ease of adult learnability. 

This now, in turn, helps us to appreciate under which circumstances contact 
will lead to the reverse process, complexification. Contact leading to com- 
plexification will also be of a particular type. We can expect to see additive com- 
plexity developing in stable, long-term, co-territorial contact situations which involve 
childhood — and therefore pre-threshold and proficient — bilingualism. It is this 
kind of situation which gives rise to the phenomenon of the Sprachbund: “strong 
linguistic areas are typically characterised by large numbers of small linguistic 
communities on good social terms [my italics]. Their members are in frequent 
contact and often become multilingual” (Mithun 1999: 314). And the length of time 
that may be involved in this kind of co-territorial contact and bilingualism, some- 
times stretching back thousands of years, is illustrated in what Mithun has to say 
in connection with her work on Californian languages (2007: 146): 


California is home to tremendous genetic diversity. The most ambitious reductive 
hypotheses have grouped the languages into seven possible genetic units ... Yet we 
see striking parallelisms in abstract grammatical structure which cross-cut genetic 
lines. The languages are characterized by pervasive, often elaborate sets of means/ 
manner prefixes and locative/directional suffixes, structures that are relatively rare 
outside of the area. The situation is strongly suggestive of transfer through language 
contact. Relatively little is known for certain of the prehistory of the California groups 
before their contact with Europeans, but it is clear that there was an extensive period 
of intense social contact, multilingualism, and intermarriage in this area. 


It is clearly also this type of (very) long-term contact that is the main concern of
Heine and Kuteva’s book: they write (2005: 5) that “contact-induced language
change is a complex process that not infrequently extends over centuries, or even 
millennia.” And it is also this kind of situation which occurs in Nichols’ residual 
zones where complexification as a result of additive borrowing is very typical. 
As we saw, Nichols cites as being residual zones areas such as: the Pacific North 
West, which includes the Clear Lake Miwok language discussed by Mithun (see 
above) as being considerably more complex than related languages; and the 
Caucasus. Haudricourt (1961) cites the (now extinct) Caucasian language Ubykh 
as having an enormously complex phoneme inventory including 78 consonants, 
while Vogt (1963) gives the number of consonants as 80. 


8 Conclusion 


To sum up, sociolinguistic-typological considerations lead us to suppose that 
typically, and on average, and allowing for the fact that this is a very simplistic 
version of what actually happens in human societies, and that all of the categories
I am employing represent continua which permit of many degrees of more-or- 
less, the following is the case: 


1 high-contact, long-term pre-critical threshold contact situations are more 
likely to lead to additive (and only additive) complexification; 

2 high-contact, short-term post-critical threshold contact situations are more likely
to lead to simplification; 

3 low contact situations are likely to lead to preservation of existing complexity. 


And this of course leads to the fascinating question of whether it is also the case 
that it is low-contact situations that are more likely to lead to the spontaneous 
production of nonadditive complexity (Trudgill 2009; forthcoming a; forthcoming
b). But that is another story.


NOTES 


Very many thanks for their help with this paper to Anders Ahlqvist, Enam Al-Wer, 
Stephen R. Anderson, Lars-Gunnar Andersson, Peter Bakker, David Britain, Jenny Cheshire, 
Harald Clahsen, Greville Corbett, Peter Culicover, Östen Dahl, Jan Terje Faarlund, Paul
Fletcher, George Grace, Martin Haspelmath, Bernd Heine, Arthur Hughes, Ernst Håkon
Jahr, Brian Joseph, Ove Lorentz, Mike Garman, Marianne Mithun, Miriam Meyerhoff, 
James Milroy, Lesley Milroy, Dennis Preston, Nikolaus Ritt, Michael Swan, and Henry 
Widdowson. 

1 Outside this area, front rounded vowels also occur in Turkish — and Albanian has /y/. 

2 These parameters are by no means independent: other things being equal, small, stable 
communities are likely also to have a high degree of shared information, for instance.

3 I do not discuss further here the important issue of what complexity can or might
mean (though see below), but obviously the notion itself is very complex — see for 
example Kusters (2003), Dahl (2004), Hawkins (2006), Miestamo, Sinnemäki, and
Karlsson (2008), Sampson, Gil, and Trudgill (2009), Trudgill (2009). In this paper, as 
will be seen, I discuss two different facets of complexity, but obviously there are many 
more. 

4 Noun possessor, pronoun possessor, and modifying adjective in NPs; noun subject, 
direct object, and indirect object in clauses; pronoun subject, direct object, and indirect 
object in clauses. 

5 This is naturally just a sketch. The full details are given in Aikhenvald (2003: 
287-323). 

6 Languages which have lost front rounded vowels in historical times include English
and Greek.

7 This is a different language from the Southern Sierra Miwok discussed by Nichols. 

8 Pidginization is a common process which only very rarely, and in very unusual cir- 
cumstances, leads to the development of a pidgin language (see Trudgill 1996; 2000). 

9 But not necessarily identical with; see Dahl’s (2004: 39) discussion of complexity
versus cost and difficulty. 

10 Space does not permit discussion here of how the effects of simplification resulting
from adult language-learning can eventually find their way into the language as
spoken by native speakers, as in Bergen Norwegian and Copenhagen Danish, but
language shift and demographic factors clearly have to be considered (see Trudgill
1996, on creoloids).


16 Contact and
Language Death 


SUZANNE ROMAINE 


1 Introduction 


Researchers often classify situations of language contact by their outcomes. 
When groups in contact need to communicate, they have a number of possible 
choices. One is to use a lingua franca they both share (which may be an existing 
pidgin or lead to the creation of one, or some other language of wider commu- 
nication). A second option is for one or more parties to learn the other group’s 
language(s). In cases involving no substantial imbalances of power between the 
groups, stable multilingualism may result. However, where bilingualism is 
asymmetrical and the more powerful group imposes its language on a subordin- 
ate group, contact often leads to language shift or loss. This chapter deals with 
some of the causes and outcomes of shift culminating in language death. 


2 Causes of Language Death 


Although in most instances the proximate cause of language death is language 
shift, the primary causes of language shift and death are not themselves lin- 
guistic. People do not normally give up their languages willingly, but continue 
transmitting them, albeit in changed form over time. Language shift and death 
occur as responses to pressures of various types (social, cultural, economic, and 
even military) on a community. Language shift involves a loss of speakers 
and domains of use, both of which are critical for survival of a language. The 
possibility of impending shift appears when a language once used throughout a 
community for everything becomes restricted in use as another language intrudes 
on its territory. Typically the imposing language prevails in all areas of official life, 
e.g. government, school, and media, necessitating bilingualism on the part of 
the subordinate group. Usage declines in domains where the language was once 
secure, e.g. in churches, the workplace, schools, and most critically, the home, as 
growing numbers of parents no longer transmit their language to their children. 
Eventually, the dominant language tends to invade the inner spheres of the
subordinate language, so that its domains of use become even more restricted. 
Fluency in the language increases with age, as younger generations prefer to 
speak the dominant language because it is tied to socioeconomic advancement. 
The linkage between the dominant language and social mobility, along with the 
prestige of the dominant language and its predominance in public institutions, 
also leads the community to devalue their own language, culture, and identity 
as part of a process of symbolic domination. 

As an example, consider what happened in Ireland, where historical and political 
factors, chief among them population loss, loss of political autonomy, and cultural 
and physical dislocation, led to a decline in the use of Irish. Before the seventeenth 
century the overwhelming majority of the population spoke Irish, and English 
was dominant only in a small eastern region around Dublin. By 1851, however, 
Irish was almost absent from the eastern half of the country, and was losing ground 
among young people everywhere except the far western margins. Before the great 
famine lasting from 1845 to 1849, Irish ranked comfortably within the top 100 of 
the world’s 7,000 or so languages in terms of number of speakers (Romaine 2008). 
The famine killed around 1 million and led to mass emigration of another 1.5 mil- 
lion. By 1900 these losses reduced the population by more than half. 

Some of the factors responsible for the decline of Irish are now affecting other 
languages on a scale hitherto unanticipated. Indeed, we can see what happened 
to Irish and the other Celtic languages (such as Breton, Manx, Welsh, Scottish 
Gaelic, and Cornish) as an early example of a process now being played out on 
a global scale, with English, French, Spanish, and Chinese spreading at the 
expense of the world’s many thousands of small, largely rural, vernaculars. As 
large languages expand, small ones contract. Although language contact need 
not imply language shift or death, this scenario in which intense pressure from 
a dominant group leads to asymmetrical bilingualism among subordinate 
groups, resulting sooner or later in language shift, is seen by Thomason (2001: 9) 
as typical of language contact. Language shift is thus symptomatic of much 
larger-scale social processes that have brought about the global village phe- 
nomenon, affecting people everywhere, even in the remotest regions of the 
Amazon. Some linguists predict that between 50 and 90 percent of the world’s 
6,900 languages will disappear over the next century (Nettle & Romaine 2000). 
This alarming figure does not include dialects because no one knows exactly how 
many there are, owing to the lack of clear linguistic criteria for distinguishing 
between language and dialect (see Wolfram & Schilling-Estes 1998 for discussion 
of dialect endangerment). 

Not coincidentally, the vast majority of today’s threatened languages are found 
among socially and politically marginalized and/or subordinated national and 
ethnic minority groups within nation-states where the politics of nation-building 
gave precedence to dominant ethnic groups. Estimates of the number of such groups 
range from 5,000 to 8,000 and include among them the world’s indigenous 
peoples, who comprise about 4 percent of the world population but speak up to 
60 percent of its languages. The fate of most of the world’s linguistic diversity 
lies in the hands of a small number of people who are the most vulnerable to 
pressures of globalization (Nettle & Romaine 2000). 


3 Sudden versus Gradual Death and Its 
Linguistic Consequences 


The precise trajectory of language shift and its linguistic consequences differ 
somewhat among and within different groups, depending on a number of factors, 
including length and intensity of contact. In the US, for example, Spanish 
and English have been in contact since the sixteenth century, but Navajo and English 
have been in contact for only about 150 years. Although contact travels along a 
potentially two-way street, where more than one language exists in a community, 
the languages are rarely equal in status. Languages and language varieties are always 
in competition, and at times in conflict. Knowledge of the varied sociolinguistic 
histories of relationships between speakers and groups within a contact situation 
is key to understanding the direction and extent of influence of one language on 
another. Borrowing typically follows a path from a prestige to a nonprestige 
language. Thus, in the US the incorporation of English features into Spanish is a 
linguistic reflex of the dominance of English, while in Mexico and other parts of 
Latin America the intrusion of Spanish elements into indigenous languages like 
Nahuatl and Quechua indexes the role of Spanish as the language of social 
mobility and political power. Pennsylvania German in the US shows strong 
impact from English, while the English spoken by the same community remains 
largely free of German influence (Burridge 2006). The fact that Navajos are shifting 
to English rather than Anglos to Navajo means that virtually all Navajos are 
exposed to English, but few Anglos (and not even all Navajos) are exposed to 
Navajo. This is reflected in the minimal effect of Navajo borrowing in English. 


3.1 Sudden death 


An initial distinction between the sudden and gradual disappearance of a lan- 
guage is helpful in understanding some of the linguistic consequences of language 
death. In sudden death a language dies more or less intact as its speakers 
perish, often as a result of a natural disaster or genocide. An example 
of the former is the death of all known speakers of Tamboran due to a volcanic 
eruption in 1815 on the island of Sumbawa in the Indonesian archipelago. A case 
of sudden or near sudden death caused by genocide is Yahi, last spoken by a man 
known as Ishi, believed to be the last survivor of his tribe, whose members were 
murdered and driven into exile by white settlers in California (Kroeber 1964). 

No one knows how many languages have vanished under similar circumstances 
without having been recorded. A number of languages have been passed down 
only in a few word lists written down by doctors, surveyors, clergymen, and 
others acting as amateur linguists. The only remnants of Tamboran are contained 
in a short word list collected by Sir Thomas Raffles. In another case from 
California, Elmendorf (1981) found a single last speaker for each of two languages 
(Wappo and Yuki), but neither was in active use and no young people were acquir- 
ing either language. By the mid 1960s Laura Fish Somersal was the last person 
able to carry on a conversation in Wappo. She used the language several times a 
week when her sister visited her. The language survived so long with Somersal 
because she did not go to school, where she would have been exposed to more 
English. She remained at home to care for her blind mother, with whom she used 
Wappo. By contrast, Arthur Anderson, the last person to remember Yuki, had 
been schooled in English and had long since shifted to that language for every- 
day use. He had not spoken Yuki since 1908. 

When only one or a few speakers are left who no longer use the language regu- 
larly, they may not remember it well enough to allow linguists to reconstruct 
what the language was like in its healthier days. Although anthropologists 
Alfred Kroeber and Thomas Waterman and linguist Edward Sapir worked to docu- 
ment the Yahi language in spoken and written form, much remains unknown. 
We will never know the extent to which Ishi was representative of the last gen- 
eration of Yahi speakers. When Dixon (1972) first worked with the approximately 
30 remaining fluent Dyirbal speakers in North Queensland, Australia, 20 tradi- 
tional kin categories were indicated in the language, but it would now be imposs- 
ible to figure out the system on the basis of evidence obtained from the younger 
generation (Dixon 1991). Gros Ventre speech forms were once differentiated into 
male and female varieties, but now only the female forms have survived because 
the language has not been spoken regularly for some decades on the Fort 
Belknap Indian reservation in Northern Montana (Taylor 1989). 

When a language is highly stigmatized, many are reluctant to admit that they 
speak it. Writing of Scottish Gaelic speakers who emigrated to Cape Breton, Nova 
Scotia, Mertz (1989: 12) remarks that young people’s denials of any knowledge 
of Gaelic represent attempts to deny an image of themselves as poor or lower- 
class. As knowledge of English was required for assimilation to and social mobil- 
ity within mainstream Canadian English-speaking society, the symbolic linkage 
between Gaelic, rural “backwardness” and economic hardship propelled language 
shift. Some stop speaking their languages out of self-defense as a survival strat- 
egy. Consider El Salvador in 1932, when after a peasant uprising Salvadoran 
soldiers rounded up and killed anyone identified as Indian either by dress or 
physical appearance. Some 25,000 were killed in this way. Even three years later 
radio broadcasts and newspapers were calling for the total extermination of the 
Indians to prevent another revolt. Many people stopped speaking their lan- 
guages to avoid being identified as Indian in order to escape what they feared 
was certain death. This led to the eventual extinction of some languages like 
Cacaopera (Campbell & Muntzel 1989: 183). It is not always possible to locate all 
the remaining speakers in a dwindling and sometimes scattered population. 

These examples illustrate some of the reasons why linguists often find it 
difficult to decide when the last speaker of a language has died; indeed, in some 
cases the only remaining person to know a language may be a linguist who worked 
with the last remaining native speakers and survived their death. Due to the 
difficulty in pinpointing precisely the absolute end of any language, the distinction 
between sudden and gradual death can be blurred. Moreover, a language 
may have effectively disappeared from active everyday use, without being com- 
pletely forgotten by all of its former speakers. “Rememberers” may survive the 
active use of a language by several generations. Sometimes such people recall things 
from a language they never fully learned or used. Campbell located several old 
men in El Salvador in 1974 who could remember a few words and phrases of 
Cacaopera. Only two men had learned more than a few words, either from a grand- 
mother or grandfather. Many languages survive only in these remembered bits 
and pieces and are no longer regularly used. It is sometimes no longer clear even 
to community members themselves who still speaks or remembers the language 
once more widely spoken. 


3.2 Gradual death 


Gradual death generally takes place over the course of several generations or 
more, as the dying language typically goes through a period when it is no longer 
used for all the functions and purposes it once served. Even a language once fully 
acquired may recede from active recall if no longer used. Disuse creates a vicious 
circle of attrition. As speakers forget more and more of it, it becomes difficult to 
recall the old words, especially when some of the things they referred to have 
become obsolete because they are related to traditional customs no longer prac- 
ticed. The process of attrition can take place in situ (as in the cases of Dyirbal 
and Gros Ventre) as well as in immigration contexts, where the language in 
question is still used elsewhere. In both types of settings changes are rooted in 
the transmission process. Traditional community and family structures and prac- 
tices once supporting the transmission of language and culture have weakened. 
Major changes in socialization patterns have made the formerly normal process 
of acquiring languages such as Navajo at home the exception rather than the rule. 
Spolsky (2002) compared the situation in 1970 (when 90 percent of children 
entered school as Navajo monolinguals) with that 30 years later, when the pattern 
had reversed, with most children entering school as English monolinguals. 
So-called immigrant bilingualism in countries such as the United States illus- 
trates a similar and very common pattern, where bilingualism is largely a tem- 
porary and transitional stage in intergenerational language shift, propelling a 
community from total monolingual competence in the native language to virtual 
monolingual competence in English. The older generation may be largely mono- 
lingual, not ever acquiring English well, and the youngest generation may like- 
wise be monolingual, but in English rather than the parents’ native language. 
Even where monolingual first-generation parents speak their language at home, 
their children are exposed to English through older siblings and playmates. 
The proficiency of older children in the parents’ language is often greater than 
that of the younger children, but many do not reach native-like proficiency. 
Once they go to school, exposure to the home language often becomes minimal 
and productive skills in the language are severely limited. Thus, by the third 
(and sometimes even second) generation, immigrants are generally dominant 
in English: the immigrant language, if they can speak it at all, reveals signs of 
incomplete acquisition, attrition, and influence from English. There may be a 
continuum of types of acquisition, attrition, and proficiency occurring even 
within the same family (Gonzo & Saltarelli 1983; Finocchiaro 2004). 


4 Changes Characteristic of Attrition 


The two main causes of change in dying languages, incomplete acquisition and 
declining use, often lead to structural and stylistic attrition. So-called attrition 
studies have been carried out by researchers concerned with the process of language 
shift and death as well as by those interested in second language acquisition. 
Detailed studies dating from the 1980s have aimed at identifying areas of 
language particularly susceptible to change, determining the rate at which change 
occurs, and establishing correlations between linguistic change and social variables 
(e.g. age and gender). Some key questions are whether it is possible to distinguish 
between internally and externally motivated change, and between universal and 
specific changes. 

Dorian’s (1981) study of the remaining speakers of East Sutherland Gaelic in 
Scotland drew attention to some of the kinds of deviations that other researchers 
have since documented in other cases. Dorian (1977) originally applied the term 
“semi-speaker” to individuals who failed to develop full fluency and normal adult 
proficiency. Some spoke the language, but with deviations from fluent older 
speakers. Others seldom spoke the language, but nevertheless had good passive 
competence. Semi-speakers tended to substitute more analytic structures for syn- 
thetic ones, to analogically level irregularities, and to have fewer stylistic options 
or registers. Tsitsipis (1989) identified a category of speakers similar to Dorian’s 
semi-speakers among a group of Albanian emigrants to Greece. Referring to them 
as “terminal speakers,” Tsitsipis (1989: 119, 125) found that their variety of 
Albanian (Arvanitika) was distinguished from that of fluent speakers by virtue 
of heavy lexical loss, loss or confusion of crucial phonological oppositions, and 
simplification of grammatical paradigms through substantial reduction of allo- 
morphy. In addition, terminal speakers relied heavily on formulaic utterances. 

Structural reduction goes hand in hand with stylistic reduction, which is intim- 
ately connected to functional restriction as limited productive competence in a 
dying language forces terminal speakers to depend more and more on fixed phrases 
and less on creative new utterances (Mougeon & Beniak 1991). Stylistic shrinkage 
may proceed from the top down (i.e. starting with formal or high registers) or from 
the bottom up (Campbell & Muntzel 1989: 185). In cases where the minority language is 
restricted to ceremonial or school use, informal, everyday styles may be reduced 
or nonexistent. Alternatively, restriction to the domestic sphere and informal in- 
group settings involving networks of family and friends often results in young 
people’s failure to acquire forms appropriate for more formal contexts. This is 
one reason why second-generation speakers of immigrant languages such as 
German, French, Italian, and Spanish with so-called T/V systems of address that 
index familiarity and intimacy (e.g. tu in French and Italian, tú in Spanish) versus 
formality and distance (e.g. French vous, Italian Lei, Spanish Usted) tend to overuse the 
familiar forms. The fact that this distinction is not matched in English (which has 
only socially unmarked you) may also be a contributing factor in the overgener- 
alization of familiar forms. Some detailed examples of changes in dying languages 
follow with reference to some typically affected areas of linguistic structure such 
as lexicon, phonology, classifier systems, pronominal systems, case marking, and 
syntax. 


4.1 Lexicon 


Because the disappearance of a minority language and its related culture almost 
always forms part of a wider process of social, cultural, and political displace- 
ment, many of the changes affecting dying languages tend to eliminate much of 
what is culturally distinctive, e.g. vocabulary for local flora, fauna, and native 
traditions and knowledge (Nettle & Romaine 2000). The most complex parts of 
the language requiring the longest time to acquire are often among the first to 
weaken or disappear because children’s acquisition is interrupted or receives no 
support from home or school. When the frequency of irregular and marked forms 
falls below a critical threshold, younger speakers are less likely to acquire them 
as they increasingly use the dominant language. 

In her study of the Dyirbal spoken by younger people, Schmidt (1985) inter- 
viewed the children and grandchildren of the older speakers Dixon (1972) had 
previously worked with. Young people between the ages of 15 and 39 who could 
still speak the language used a variety quite different from the traditional form 
documented by Dixon. While older speakers formerly had names for over 
600 plants, some of the younger people were able to remember fewer than half 
of 500 items of basic and culturally distinctive vocabulary. Less fluent speakers 
were able to recall fewer words than more fluent ones. New words were rarely 
coined because speakers lacked the base forms to which once-productive word 
formation rules could apply. Younger people also tended to lose more specific 
words and replace them with a general one. For instance, there are many words 
equivalent to the English adjective big. To call an eel big, one would say it was 
qunui, but to call a scrub turkey big, one would say it was waqala. Young 
people, however, use only one word meaning ‘big’ to refer to all kinds of big 
things. Younger Dyirbal speakers have also generalized or lost some of the rich 
terminology for referring to local flora and fauna. In traditional Dyirbal eels had 
individual names, but young people used the term jaban, originally referring to 
a spotted eel, to refer to all kinds of eels. 

Another case in point is the erosion of the vocabulary for reindeer and the decline 
of reindeer herding among the Tofa of the Sayan mountains in southern Siberia. 
Reindeer once provided the basis for their traditional economy, but by the end of 
the twentieth century only a single community-owned herd of 400 head survived. 
The youngest (and probably last) of the Tofa reindeer herders is 19-year-old Dmitry 
A., who speaks only Russian. His father and uncle still speak Tofa and retain the 
traditional vocabulary used to classify reindeer in terms of age, sex, rideability, 
fertility, and tameness. This intricate system allowed herders to describe accurately 
any given reindeer by referring to it with a unique label representing a combination 
of its attributes; döngür means ‘male domesticated reindeer in its third year 
and first mating season, but not ready for mating’. Dmitry, however, needs to 
explain this in Russian by means of a complex noun phrase listing the individual 
qualities. As Harrison (2007: 26-9) explains, what is lost in translation is efficiency 
of information packaging, and with it, culturally specific knowledge adapted to 
the narrow ecological niche of reindeer herding in the south Siberian forests. 

Much of the world’s so-called traditional ecological knowledge is passed down 
orally and is always only a generation away from extinction. When forests are 
cut and people stop collecting plants for food and medicine, they soon begin to 
forget not only the uses the plants were once put to, but even the names of the 
plants themselves. When asked about the uses of some of the native trees, many 
young people in eastern Indonesia say that they are good only for timber. 
Knowledge about traditional uses of the trees is no longer being passed down 
because the forests are being logged and plantation agriculture is replacing 
cultivation of a variety of native plants and trees. In the Highlands of Mexico, 
people stand in line at a field clinic for the visiting health worker to dispense 
medicine for common ailments they once treated themselves with their abundant 
medicinal plants. Lizarralde (2001), for instance, found that 40 to 60 percent of 
the ethnobotanical knowledge of the Bari-speaking people of Venezuela was 
being lost from one generation to the next. Watson (1989: 52-3) reported a simi- 
lar decline in the knowledge of plant names among traditional practitioners of 
herbal medicine in Ireland and Scotland along with loss of a large percentage 
of distinctive vocabulary relating to the rural lifestyles of the remaining Gaelic 
speakers. A number of traditional livelihoods such as fishing, farming, and 
weaving, each with its own terminology, have been largely supplanted by the modern 
urban-based economy. Even where farming continues, new technology and 
related terminology from English have been introduced. 

Meanwhile, talking about new domains connected with the dominant culture 
generally means adopting terms from its associated language, and often switch- 
ing entirely to that language. Haugen (1953: 71) remarked of the Norwegian 
spoken by immigrants to the US that “at practically every point they maintained 
the basic phonetic and grammatical structures of their native dialects, but they 
filled in the lexical content of these structures from the vocabulary of English.” 
In similar fashion, some varieties of Pennsylvania German appear to be heading 
toward something akin to an English lexicon embedded within a structure still 
distinctively German (Burridge 2006). In asymmetrical bilingualism borrowing 
typically exceeds need, and gratuitous borrowing of core vocabulary items is 
common (Romaine, in press). Germans in the US and Australia have borrowed 
fridge (cf. der Kühlschrank) and shop (cf. der Laden); Italians in the US have adopted 
fence (cf. il recinto) and yard (cf. il giardino), and Spanish speakers in the US have 
borrowed a number of English verbs by incorporating them into a class of 
infinitives terminating in the special ending -ear, e.g. lunchear (cf. almorzar) ‘to eat 
lunch’, parquear (cf. estacionar) ‘to park’. 


4.2 Classifier systems 


Classifier systems are frequent casualties of language attrition and death. A 
whole range of such systems may be especially vulnerable because they are highly 
concentrated in languages spoken in parts of the world whose ecosystems and 
languages are under severe threat (Nettle & Romaine 2000: 61-9). Moreover, because 
these systems have arisen through a long-standing and intimate relationship 
between ways of speaking and cultural practices, they involve a considerable amount 
of complexity of the kind likely to be simplified or lost by attrition processes. 

One example is the disintegration of the four-way system of noun classification 
in Dyirbal. Each noun must be preceded by one of four classifiers. Thus, ‘man’ 
is bayi yara; ‘woman’ is balan jugumbil; ‘vegetable food’ is balam wuju; and ‘tree’ 
is bala yugu. The bayi class includes men, kangaroos, possums, bats, most snakes, 
the moon, etc. The balan class contains women, bandicoots, dogs, anything 
connected with fire or water, sun, stars, etc. The balam class comprises all edible 
fruits (and the plants bearing them), ferns, honey, cigarettes, etc. The bala class 
includes body parts, meat, bees, most trees, mud, stones, etc. The first class obvi- 
ously includes human males and animals, while the second contains human females, 
birds, water, and fire. The third has non-flesh food and the last, everything not 
in the other classes. 

Nevertheless, as in other complex noun classification systems, there is a great 
deal of variation in noun class assignment that cannot be explained on semantic 
or other linguistic grounds, but is dependent on a knowledge of traditional 
myths and cultural beliefs. In Dyirbal there is also a general rule at work that 
puts everything associated with the entities in a category in that particular class. 
This means that fish are in the bayi class with men because they are seen as 
animals, but fishing lines, spears, etc. are also in the same class due to their 
connection with fish. Birds, however, belong to the balan class with other female 
beings because birds are the spirits of dead human females and the moon and 
sun are husband and wife according to Dyirbal myth. The moon goes in the bayi 
class with men and husbands, while the sun belongs with females and wives, as 
does fire which is associated with the sun. One further principle applies. If some 
members of a set differ in some important way from the others (usually in terms 
of their danger or harmfulness), they are put into another group. Thus, while fish 
are in the bayi class with other animate beings, the stonefish and gar fish, which 
are harmful and therefore potentially dangerous, are in the balan class with 
women. 

Younger people are now less familiar with the kind of ancestral knowledge under- 
pinning this system of noun classification, so that less fluent speakers operate with 
a simplified system that classifies all nouns into just two groups, animate and inan- 
imate, with another group for everything that cannot be classed along animacy 
lines. All inanimate nouns such as ‘tree’, ‘table’, etc. are put into the balam class. 
Thus, ‘tree’ is balam yugu instead of traditional Dyirbal bala yugu. Animate nouns 
such as ‘woman’, ‘kangaroo’, etc. are further divided into males and females, with 
males going into the bayi class, and females into the balan class. The distinction 
between the balam and bala classes has disappeared for the younger less fluent 
speakers, who no longer use the classifier balam for edible fruits and vegetables. 
In the new simplified system everything not belonging to the other classes is put 
into the bala class. 

This reorganization has simplified the basis for allocating items to the various 
classes. The concepts of water, fire, and fighting traditionally associated with the 
balan class have now been lost. Only one basic criterion, femininity, is used for 
categorizing nouns as members of that class. Less fluent speakers classify birds 
with other animate things in the bayi class, while traditional speakers put them 
into the balan class. Less fluent speakers put both ‘sun’ and ‘moon’ into the bala 
class with other inanimate objects, but traditional speakers put the moon in the 
bayi class and the sun in the balan class in accordance with Dyirbal beliefs. 
Younger speakers do not treat harmful and dangerous items like the stonefish 
and stinging nettle as exceptions, but instead put them into the bayi class if they 
are animate, and the bala class if inanimate. In traditional Dyirbal they would have 
belonged to the balan class. 

Schmidt (1985) also noted a tendency for loss of traditional tribal structure 
and culture to be accompanied by restructuring of kinship terminology and 
classification. Most younger Dyirbal speakers could give names for only the more 
basic kinship relations such as brother, wife, and husband. Traditional Dyirbal had 
four words corresponding to English uncle: mugu ‘mother’s elder brother’, gaya 
‘mother’s younger brother’, bimu ‘father’s elder brother’, and ŋuma ‘father’s younger 
brother’. Younger speakers, however, used only the terms gaya or bimu to refer 
to all persons having the relationship of ‘uncle’. 


4.3 Pronominal systems 


Bavin and Shopen (1991: 108) documented a number of age-graded linguistic 
discontinuities in the pronominal systems of Warlpiri, a language spoken 
by ca. 3,000 people in various desert communities north and north-west of 
Alice Springs in Australia’s Northern Territory. Although Warlpiri is one of the 
strongest of the few remaining Aboriginal languages, with children in some 
communities still acquiring it, the language being transmitted differs to varying 
degrees from the traditional Warlpiri spoken by their elders. Younger speakers 
have simplified and reduced the traditional pronoun system; they are losing the 
inclusive/exclusive contrast along with dual number in pronouns and reducing 
allomorphy, resulting in a reduced inventory of regularized forms. For instance, 
older speakers have three forms for second person n, npa, and nku, while 
younger speakers have only npa. These innovations serve as input for the next 
generation because children past the toddler stage spend a lot of time with other 
children away from adults. Despite acknowledging that the disappearance of the 
inclusive/exclusive contrast and dual number may be motivated by the absence 
of these categories in English, Bavin and Shopen do not attribute the changes 
in young people’s Warlpiri to language death, or to English influence. Rather, 
they see them as consequences of internally motivated changes toward greater 
semantic transparency and fewer oppositions. 

Derhemi (2006: 42-3) observed considerable attrition in the possessive pronouns 
in a variety of Albanian spoken in an enclave community in Sicily. An 83-year- 
old speaker produced the full paradigm of 28 forms distinguished by gender and 
number, while a 16-year-old speaker produced only 7. Moreover, none of the forms 
produced by the younger speaker was present in the older speaker’s paradigm; 
the younger speaker’s forms were innovations that did not maintain gender 
and number distinctions. These speakers represent two opposite poles of the 
proficiency spectrum. Likewise, Rottet (2001) found a reduction in the pronominal 
and verbal systems of younger speakers with limited competence in Louisiana 
French. While older fluent speakers used three or four variants of the third person 
plural pronoun (‘they’), the youngest groups used only one (eusse). 


4.4 Case marking 


The reduction of allomorphy often leads to case syncretism or complete loss of 
case marking. Dorian (1981), for example, found that the genitive case in East 
Sutherland Gaelic was moribund, with most fluent and nonfluent speakers sub- 
stituting prepositional phrases in its stead. Semi-speakers and younger fluent 
speakers had weakened control over dative marking. Semi-speakers seldom 
used the vocative. These economies in case marking resulted in some nominal 
declensions comprising one invariant form in the speech of all but the older fluent 
speakers. Schmidt (1985) likewise observed reduction and loss of case morpho- 
logy among young Dyirbal speakers, who overgeneralized the use of a single case 
affix or used an English preposition to avoid the need for bound morphemes. 
Morphologically rich languages such as Hungarian, with its 17 to 27 cases, are
particularly interesting. Fenyvesi (2005) reported that the case system
used by Hungarian immigrants in the US had undergone considerable change, 
including loss of suffixes in about half the cases. Immigrant varieties of German 
spoken in the US also frequently instance case syncretism by dint of collapse either 
of the nominative and accusative, or the dative and accusative (Huffines 1989). 
Schmid (2002) found a weakening of genitive case among German-speaking Jews 
who emigrated to England and the US. 


4.5 Phonology 


Researchers have noted a tendency for contraction in phonological systems 
leading to the loss of oppositions. Campbell and Muntzel (1989: 187) hypothesize 
that the marked member of a phonological opposition will be more likely to 
disappear. In El Salvador, for instance, Pipil has lost contrastive vowel length, a 
distinction not found in the dominant language, Spanish, while in Tuxtla Chico 
Mam the merger of /q/ and /k/ in favor of /k/ has likewise eliminated the marked 
member of the opposition. However, these reductions are also consistent with
Andersen’s (1982: 92) prediction that subordinate language distinctions not 
matched in the dominant language will be vulnerable to elimination. This phe- 
nomenon is sometimes referred to as negative borrowing or covert interference. 
At the same time, however, Campbell and Muntzel (1989: 189) cite cases where 
marked features have been overgeneralized. Speakers of moribund Teotepeque 
Pipil have overgeneralized voiceless /l/ so that it occurs everywhere and not just
word-finally. In some varieties of Xinca speakers have overgeneralized consonant 
glottalization. Although Campbell and Muntzel emphasize that these overgener- 
alizations are internal developments tied to imperfect learning rather than 
prompted by Spanish, Woolard (1989: 363) views them in a different light. She 
contends that these particular sounds are overgeneralized precisely because 
Spanish lacks them. Hence the dominant language shapes change in the minor- 
ity language, whether the direction is toward greater similarity with it or greater 
distance from it. 

Another common characteristic of contracting languages is weakening of for- 
merly obligatory rules governing phonological processes. Dorian (1981) analyzed 
the differential failure of lenition of word-initial consonants in East Sutherland 
Gaelic, a phenomenon reported in other Celtic languages (Jones 1998). Campbell 
and Muntzel (1989: 189-90) note the failure of consonant gradation in American 
varieties of Finnish. 


4.6 Syntax 


Many (but not all) of the reported syntactic changes confirm Andersen’s (1982) 
prediction that imperfect speakers will control a smaller number of syntactic con- 
structions than fluent speakers. Schmidt (1985) observed that no one under the 
age of 15 was even able to construct a Dyirbal sentence. Where variant constructions 
exist, less fluent speakers will tend to collapse them into one, as in the case of the 
passive in East Sutherland Gaelic. Younger fluent speakers had a compromise struc- 
ture comprising elements of the two traditional passives formed with different 
verbs, while semi-speakers showed interference from English (Dorian 1981). 
There are many examples of speakers creating a new construction or category 
based on a model available in the other language. Yiddish speakers in the US 
have created a future tense on the model of the English be going to future 
(Rayfield 1970: 69). Compare ge ikh kumen bald (‘I’m going to come soon’) with 
standard German, which uses either the simple present (Ich komme bald) or the 
auxiliary werden ‘to become’ (Ich werde bald kommen). This development is in line 
with Schmid and de Bot’s (2004: 217) observation that attrition is manifested in 
a preference for periphrastic constructions, so that inflected futures get replaced 
by go-futures. Such innovations often pass unnoticed because they incorporate 
no foreign material. In this sense they operate in a similar fashion to a type of 
borrowing called calquing or loan translation often involving idioms and longer 
phrases. For instance, US Spanish tener un buen tiempo (cf. pasarla bien) is a word- 
for-word translation of ‘to have a good time’. 
A number of researchers have documented a reduction in syntactic complexity
in dying languages. Hill (1989) points to a number of cases (e.g. East Sutherland
Gaelic, Dyirbal, Cupeño, and Luiseño) in which the frequency of relative clauses
has declined. In Pipil once productive passives survive only in frozen verb 
forms (Campbell & Muntzel 1989: 192-3). One of the most obvious effects of 
cross-linguistic influence in syntax is reflected in word order divergence. Second- 
generation German speakers in English-speaking countries like the US and Australia 
tend to overgeneralize SVO word order, replacing German SOV word order required 
in subordinate clauses. American Finnish has become more rigidly SVO along 
with the loss of case morphology, while Finnish has relatively free word order 
(Campbell & Muntzel 1989: 194). Likewise in Dyirbal, the loss of ergative mark- 
ing was accompanied by greater word order rigidity. Dorian’s semi-speakers, how- 
ever, retained VSO word order despite their other syntactic deviations. 


5 Universal versus Specific in Loss and Retention 


The studies reviewed in this chapter have revealed recurrent regularities in 
the attrition process. They provide some, but not unequivocal, support for the 
regression hypothesis, in which language attrition is regarded as a reversal of the 
acquisition process. The hypothesis predicts that in losing a language speakers 
will follow an order opposite to that of acquisition so that the features learned 
first are the last to disappear. The linkage between acquisition and loss was 
articulated by Jakobson (1941/1971) in his hypothesis of irreversible solidarity, 
which predicted that the dissolution of the sound system of aphasics was an 
exact mirror image of children’s phonological development. Jakobson’s hypoth- 
esis followed from his proposal that there were universal principles governing 
the organization of phonological systems manifested synchronically, diachronic- 
ally, and ontogenetically. Thus, for example, the presence of fricatives in a lan- 
guage implies the presence of stops because typologically there are no languages 
without stops, although there are some without fricatives. Hence, the acquisition 
of fricatives presupposes the acquisition of stops. 

Although the regression hypothesis has fallen out of favor among scholars of 
pathological language loss, some researchers have tested it as a predictor of out- 
comes in systems undergoing attrition as a result of contact and/or disuse, and 
in doing so, have extended the idea of “last acquired, first lost” (or “first in, last 
out”) to morphology and syntax. Schmid (2002), for example, found that plural 
markers in German under attrition are subject to interference from English, and 
are acquired last, but case and gender markings, which are acquired earlier, remain 
more stable. Among English speakers who had learned Japanese or Chinese as a 
second language, the most frequent classifiers were acquired earliest and tended 
to be most resistant to loss (Hansen & Chen 2002). Although a few studies such 
as these from immigrant bilingualism or second language acquisition report 
findings broadly supportive of regression, more data are needed from both 
acquisition and attrition. Because a dying language by definition is not being actively 
transmitted, there are usually no available acquisition data for comparison.
Another methodological complication derives from the fact that we still have no 
measure of ordered progression comparable to yardsticks such as MLU (mean 
length of utterance) widely used by child language researchers to assess milestones 
in development (see also Dorian 1989: 1; Romaine 1989; Schmid & de Bot 2004: 
226, 228). 

Although not all changes in dying languages involve loss and simplification, 
much of the preceding discussion (like the research on which it is based) has con- 
centrated on what is lost (usually without compensation) rather than retained. 
The reductions in young people’s Dyirbal are not compensated for by structural 
expansions elsewhere in the system, as they might be in other languages under- 
going normal change. When Dyirbal speakers lose traditional kinship terms, they 
do not replace them. They rely instead on English words to fill the gaps, but the 
English words do so only very imperfectly. In some cases there is nothing equiva- 
lent in English that could replace what is lost. Although noun classification 
in young people’s Dyirbal is still partially maintained, and is different from 
English, it is at the same time less rich in its semantic associations than the tra- 
ditional system. 

Nevertheless, examples of remarkable retention cry out for explanation. Hamp 
(1989: 198) found an elderly speaker who preserved the one distinctive central 
Arvanitika vowel in an otherwise thoroughly hellenized phonology and phonetics. 
Indeed, some speakers preserved some non-Greek features to the end. Likewise, 
Hamp (1989: 206-7) contends that the phonologies of the last few remaining 
speakers of the Scottish Gaelic dialects of Kintyre and Muasdale preserved 
archaic features and distinctions. Hamp’s findings for these two dialects, how- 
ever, contrast with those of Dorian for East Sutherland Gaelic. Mithun (1989: 257) 
found differences between two separated Cayuga communities in Oklahoma and 
Ontario. In Ontario adults used the language daily in conversation and ceremonies. 
Yet, strikingly, it was the Oklahoma community, where Cayuga was used by only 
a few individuals (and only rarely), who had nearly complete retention of “an 
amazingly complex morphological and phonological system.” 

Thomason (2001: 236) observed no mixing between English and Salish in the 
speech of the small number of remaining elderly speakers of Montana Salish 
spoken on the Flathead Reservation. Although Salish is severely endangered with 
fewer than 60 remaining fluent speakers, only a handful of English words have 
been borrowed after 150 years of contact. Despite the loss of some Salish stylistic 
resources, the language has retained its syntactic structure, semantics, morpho- 
logy, and phonology. Such cases indicate that reduction in use does not neces- 
sarily lead to structural reduction; nor does intensive contact always involve 
extensive borrowing. Nor do proficiency continua of the type found by Dorian 
and others always come into being. Hill (1989), for example, observed of Luiseño
and Cupeño Indians in California that they spoke their native language either
“fairly well or not at all.” 

Because acquisition of the dominant language proceeds in tandem with the loss 
of the minority language, distinctive features of a receding language may also be 
transferred to and survive in an equally distinctive form of the dominant language
replacing it. The Highland variety of English to which terminal Scottish Gaelic 
speakers shift incorporates a large number of the most distinctive phonetic traits 
of their Gaelic. Hence, the phonology of terminal Gaelic is protected and preserved 
by Highland English. Some aspects of Aboriginal identity and language live on 
in the local and highly distinctive (though stigmatized) variety of English spoken 
among young people now. Although most of the lexical and morphological items 
in Kriol (an English-lexicon creole), for instance, are derived from English, some 
terms for flora and fauna make their way into Kriol. Kriol morphosyntactic 
categories and ways of speaking are also distinctly Aboriginal. Like Aboriginal 
languages (but unlike English), Kriol distinguishes singular, dual, and plural as 
well as inclusive versus exclusive pronouns. Nevertheless, not all grammatical 
categories found in Aboriginal languages have reflexes in Kriol (Pensalfini 2004: 
148). The noun classification systems of Aboriginal languages such as Dyirbal encode 
culturally significant groupings of concepts with no equivalents in English or Kriol. 

Langlois (2002: 128), however, concluded that English influence has not dimin- 
ished the complexity of the traditional Pitjantjatjara kinship system in the 
Areyonga community, where a majority of people still speak Pitjantjatjara (or 
its neighboring dialect Yankunytjatjara). In traditional Pitjantjatjara a person’s 
mother along with mother’s sisters are referred to as ngunytju and a person’s 
father along with father’s brothers, as mama. Although teenagers are replacing the 
traditional terms ngunytju with English maama, and mama with English fatha, the 
borrowed terms still retain their semantic reference to traditional kin categories. 
Nevertheless, some confusion has resulted from the phonetic similarity between
English maama and Pitjantjatjara mama because the difference in vowel length
between the two terms has not always sufficed to distinguish them. Teenagers 
also often used the English term aanti (‘aunty’) to refer to “father’s sisters” and 
angkala (‘uncle’) for “mother’s brothers.” Other areas of kinship terminology 
have undergone more extensive change involving not just the introduction of new 
terms from English, but also distinctions not existent in traditional Pitjantjatjara. 
Traditional Pitjantjatjara differentiates between older brother (kuta) and older sis- 
ter (kanguru) including children of parent's siblings, but not between younger brother 
and sister. Langlois (2002: 124) observed teenagers using the English terms sista
and bratha to distinguish “younger sister” (sista) from “younger brother”
(bratha). Here the English borrowings are used to introduce a distinction not
matched in traditional Pitjantjatjara, thereby enriching rather than collapsing 
semantic distinctions. Pensalfini (1999) also found rising use of case markers as 
indicators of pragmatic prominence in Jingulu. 

These two cases show that innovation may still take place in an otherwise 
moribund language in an advanced state of attrition. Moreover, not all changes 
in dying languages make them more similar to the dominant language to which 
their speakers are shifting. Some losses are compensated for, as in the substitu- 
tion of analytical marking of case relations in East Sutherland Gaelic and Dyirbal. 
Comparative studies such as those of Fenyvesi (2005) and Silva-Corvalán (1995)
examining the fate of one language in contact with a variety of others allow some 
purchase on the issue of what is universal versus specific about change in contact
settings involving shift and death at the same time as they underline the
significance of the social context. Social factors affecting the attrition process include 
age of emigration, extent of use of the language after emigration, language 
loyalty, religion, etc. Dutch Catholics in the US, for instance, switched to English 
more rapidly than did Protestants. Among the latter there are still some third- 
and fourth-generation immigrants who speak Dutch fluently, while in Australia, 
Dutch immigrants are among the first to shift to English (Clyne 2003). 

Some of the local varieties of German brought to the US by Anabaptist immi- 
grants such as the Amish, Hutterites, Mennonites, and others have survived along- 
side English for nearly 400 years. Degree of religious conservatism corresponds 
with extent of competence in German. Among the most conservative groups like 
the Old Order Mennonites a strict and stable situation of diglossia with bilingualism 
exists with no mixing of English and German. Among other less conservative groups 
and particularly among nonsectarians, as soon as English intrudes into what 
were German domains, shift to English is swift and complete. Nevertheless, the 
domain separation observed in sectarian communities has not protected their 
German from convergence; the speech of the nonsectarians, whose children 
speak only English, shows less convergence toward English (Huffines 1989: 225). 
Although some have suggested that convergence may be a survival strategy, 
Woolard (1989: 360) explains the seemingly paradoxical association between 
high structural interference and high retention by reference to the nature of 
the normative pressures exerted by different kinds of social networks. Networks 
requiring speakers (even those whose skills are imperfect) to use the declining 
language rather than switch to the dominant language will lead to the intro- 
duction of more innovative and deviant forms. By contrast, networks open to 
code-switching conserve the integrity of the minority language by allowing less 
proficient speakers to use the dominant language to fill lexical and other deficien- 
cies. However, it is far too soon to conclude, as Woolard (1989: 365) does, that 
there may be a causal correlation between language maintenance and linguistic 
convergence. 


6 Conclusions 


This chapter has examined language death as one of a number of possible out- 
comes of language contact. It has identified some of the changes typically accom- 
panying language shift culminating in death such as loss of native vocabulary 
often along with extensive lexical borrowing from the dominant language, reduc- 
tion of allomorphy, loss of phonological distinctions, analogical leveling, loss of 
grammatical categories, greater use of analytical instead of synthetic constructions, 
and stylistic shrinkage. Nevertheless, these generalizations all have exceptions, 
leaving scholars unable to predict what will happen in any particular case of 
language contact or, indeed, when or whether change will occur at all. 
Unfortunately, this applies both to contact-induced change and to internally
motivated change. Although intense language contact is a prerequisite for
extensive structural interference, it does not always lead to change. Thomason 
(2001: 61) refers to attitudes as the “wild card” making contact-induced change 
essentially unpredictable. Attempts to formulate linguistic constraints on contact 
phenomena have failed because both the direction and extent of linguistic interference
are socially determined.

Moreover, change is often the result of multiple factors, making the distinction 
between internally and externally motivated change difficult to disentangle. As 
Burridge (2006) points out, all the syntactic changes underway in Pennsylvania 
German potentially attributable to English influence (such as change in word order, 
case syncretism, creation of a get-passive, progressive and go-future) might have 
happened anyway or simply been internal developments already underway that 
have been accelerated by contact. The innovation of a go-future in Yiddish and 
Pennsylvania German is interesting in light of Heine and Kuteva’s (2005: 103) obser- 
vation that “among all grammatical categories it is future tense that appears to 
be the most likely to be replicated in situations of language contact.” Although 
the construction may be modeled on English, semantically similar verbs tend to 
follow similar grammaticalization paths; go-futures have frequently arisen in 
many languages where there is no reason to suspect contact. The case syncretism 
typical of immigrant varieties of German is also found in some native varieties, 
where there are no grounds for invoking English influence or language contact 
as the driving mechanism (Huffines 1989: 212). Maandi (1989) too argues that case 
markers in Estonian are collapsing even where the language is not in contact with 
Swedish. Likewise, morphological simplification has been underway in Irish 
since the tenth century, long before the invasion of English speakers. Indeed, the 
same can be said for virtually all of the structural changes discussed here; they 
are not specific to language death and can occur in “healthy” languages as well. 
Dorian (1981: 151) has stressed that there is nothing unusual about the types of 
changes occurring in dying languages (though the amount and rate of change may 
be atypical, as also may be the degree and type of variability). Here too the lack 
of a meaningful measure against which to assess change in contracting languages 
may prevent progress in understanding whether change is constrained or facili- 
tated in different ways in response to degree of language vitality. 

17 Fieldwork in Contact
Situations 


CLAIRE BOWERN 


1 Introduction 


Few communities are wholly linguistically homogeneous and completely isolated 
from their neighbors. There is great variety in the extent of language and dialect 
contact in different communities, and therefore consideration of language contact 
is important in fieldwork and language documentation. However, fieldwork 
focused on individual speech varieties (or languages) has tended to see multilingualism
as a problem rather than an opportunity, as a source of contamination
of the data under consideration rather than something to study in its own
right (see, for example, Vaux & Cooper 1999: 8; and the argument in Aikhenvald, 
to appear). Such a view is understandable in the context of the wish to describe 
a single standard linguistic variety and to produce materials which are representative 
of the language as a whole. Trying to get an accurate picture of language contact 
in a community may also make the research project considerably more complex; 
conversely, excluding contact means excluding potentially relevant data. 

While there are numerous ways in which aspects of language contact impinge 
on fieldwork and language description, in this chapter I concentrate on three 
different ways in which language contact and multilingualism are relevant to 
fieldwork (and vice versa). The first is the question of what to study when a 
linguist goes to the field. In section 2 I consider definitions of contact, typical 
contact situations, and ways of discovering if the field site involves language
contact. The second aspect of fieldwork and contact, in section 3, concerns what 
is studied in a language contact situation. The third, covered in section 4, involves 
the effects that linguistically diverse speech communities have on fieldwork. I take 
examples from both extreme contact, such as community-wide multilingualism, 
and less extreme examples. Finally, fieldwork in a language contact situation requires 
certain methods and techniques. These include (but are not limited to) data man- 
agement techniques, stimulus materials, and coding of information. 

Throughout this chapter, I stress the need for the linguist to have an under- 
standing and knowledge of the social situations at work in the community under 
study, such as demographics and history. That is simply because any linguistic
claim about language contact reduces to a claim about social behavior of speakers. 


2 Defining and Diagnosing Language Contact 
in the Field 


I take a broad definition of the term “fieldwork.” I assume that fieldwork is 
any type of linguistic data gathering where the linguist uses information from 
a pool of speakers interacting with each other in their usual environment. This 
definition includes immigrant communities (for example, the Vietnamese com- 
munity in Houston) but does not include field methods classes as fieldwork, nor 
working with a single speaker in a university setting (see Hyman 2001; Bowern 
2008). The definition of “field” and “fieldwork” becomes important when con- 
sidering language contact, since it is only in a situation where the linguist may 
observe the different factors which make up language contact that progress can 
be made. 


2.1 Defining contact and fieldwork 


At the most basic level, all linguistic interaction is “language contact,” albeit between 
extremely similar grammars. That is, speakers are exposed to many varieties of 
their language which differ in small ways from their own grammar. However, 
we use the term language contact to refer to situations where groups of people 
who speak very similar varieties are in contact with people who speak rather 
different varieties (cf. Thomason & Kaufman 1988; Thomason 2001: 2). That is, 
there is more than one speech variety in use. Fieldwork in such an area simply
involves going to an area where contact occurs, be it in a large city or a small 
remote community. That is, I do not subscribe to the “Indiana Jones” fieldwork model
(Bowern 2008: 13-14), where a linguist has to go to a remote (and preferably 
dangerous) part of the world in order to have the work count as “fieldwork” (see 
also Hyman 2001), but equally studying a single speaker outside their regular pat- 
terns of language use will not allow much insight into those patterns. Therefore 
I am assuming a situation where the linguist has traveled to a community of 
speakers, not one where a single individual is being studied outside of their 
regular social networks. 

Language contact is not, of course, a homogeneous phenomenon. Contact may 
occur between languages which are genetically related or unrelated, speakers may 
have similar or vastly different social structures, and patterns of multilingualism 
may also vary greatly. In some cases the entire community speaks more than one 
variety, while in other cases only a subset of the population is multilingual. 
Lingualism and lectalism may vary by age, by ethnicity, by gender, by social class,
by education level, or by one or more of a number of other factors. In some 
communities, there are few constraints on the situations in which more than one 
language can be used, while in others there is heavy diglossia, and each language 
is confined to a particular type of social interaction. An often-quoted example is
the situation of various varieties of Arabic, or the status of French in West Africa 
or Swahili in Tanzania. In some parts of Indigenous Australia, in contrast, it is 
not uncommon for several languages to be in use at once, both through code- 
switching and by different speakers speaking different languages (Evans 2001; 
Heath 1978). 

There may also be different outcomes of language contact, ranging from lim- 
ited lexical borrowing to pervasive metatypy (for the term, see Ross 1996; 1997). 
Thomason and Kaufman (1988: 50, 74-5) diagram various possible outcomes of 
contact, categorized according to degree and intensity of contact, along with other 
factors. Thomason (2001: 129-53) gives seven ways in which language contact 
may lead to language change. There are also discussions of language contact which 
play down the role of contact-induced change in language history, for example 
by arguing (e.g. King 2000) that syntactic effects of language contact are caused 
by lexical borrowing (that is, the borrowing of syntax along with lexical items). 
Some discussion of these points of view can be found in Bowern (2009) and Sankoff 
(2001). A fieldworker going to a language contact area should be familiar with 
this literature, and will probably be able to add to it! 

While there are a great number of different language contact situations, a few 
come up frequently in areas where linguists do fieldwork. One is dialect contact, 
for example between standard varieties of a language and regional varieties 
(e.g. in France or the Arab world). Such situations can exhibit strong patterns of 
diglossia. Linguists need to be aware of this because the situation in which they 
work tends to favor production of standard forms rather than nonstandard ones. 
For example, fieldworkers working on Arabic frequently report difficulties in estab- 
lishing work patterns where local (rather than standard) forms are elicited. Abbi 
(2001: ch. 7) mentions similar effects in parts of India. The degree of difference
between the various varieties makes the diagnosis of contact more or less easy, 
but it does not seem to be the case that degree of relatedness has an effect on 
degree of contact (Thomason & Kaufman 1988). Instead, the degree of contact
and transfer depends on characteristics of social interaction and the type of social
network, not on the degree of relatedness between the varieties.

Trade or work languages are also often found, where the minority language is 
spoken within the community or at home and one or more sectors of the popu- 
lation conduct business in another language. In parts of southern Namibia, for 
example, Khoekhoegowab (also known as Nama/Damara) is the first language 
of many, but business, shopping, and so on are usually conducted in Afrikaans 
(and increasingly in English), and schools are English-only after grade 4 (see 
further Maho 1998). 

A further type of language contact involves exogamous communities where more 
than one language might be used within the community because its members come 
from different areas: see, for example, Aikhenvald (2004) for Amazonia, Stanford 
(2006) for Sui, and Heath (1978) for Arnhem Land, all areas where linguistic exogamy
is practised. In northern Australia multilingual communities like Roper River, 
Milingimbi, Maningrida, and Wadeye (Port Keats) arose from the settlement of 
various groups on missions, cattle stations, and government depots. In some
parts of the world, populations of long-term refugee camps have created their 
own lingua francas. The converse of such communities where exogamy leads to 
multilingualism is an endogamous community which maintains its own language
for the purposes of excluding outsiders. Angloromani and Media Lengua
(Winford 2003; Muysken 1997) are examples. In this case, extensive contact and 
transfer from majority languages may occur because the in-group language is learnt 
by adult second language learners. 

Finally, fieldworkers very often work in endangered language communities
where language shift is in progress. Such communities may exhibit “last
speaker” effects of various types. For example, Thurgood (2003) reports the rapid 
restructuring of causative marking in the last generation of speakers of Anong. 
A similar phenomenon is often called “young people’s varieties” and is quite 
common among the indigenous languages of Australia, where they are still being 
learnt by children (Schmidt 1985; Lee 1987; Langlois 2004). In section 4.4 below I 
outline some of the ways in which these varieties may affect fieldwork. See also 
Romaine (this volume). 


2.2 Symmetric and asymmetric multilingualism 


Another potential point of variation concerns the patterns of multilingualism within
the community. Put simply, does everyone speak all the varieties that are used 
in the community? If not, who speaks what? What are the conditions under which 
each language will be learned and used? 

Truly symmetric multilingualism is sometimes argued to be quite rare. That is, 
the argument goes that in cases where the whole community speaks more than 
one language, the multilingualism is redundant and that at some point it becomes 
unstable and language shift occurs. These arguments are based on immigrant com- 
munities and heritage language speakers in countries such as the USA, where we 
see a three-generation pattern of shift from bilingual speakers to monolingual major- 
ity language speakers (see, for example, Fishman 1991). However, there are other 
situations where community-wide multilingualism has, at least historically, 
proven to be stable over a longer period. One is the case of standard language 
diglossia versus regional varieties, as in Italy or Germany. Another is the case 
of small exogamous communities where there is a social expectation of multi- 
lingualism (see Aikhenvald 2004 for examples from Amazonia; and Heath 1978 
for a discussion of an Australian case). 


2.3 Endangered (indigenous) language communities 


One very common type of language contact situation is where a minority lan- 
guage, and often an endangered language, is spoken within a larger community. 
This language may or may not be the majority language within a specific speech 
community. For example, Khoekhoegowab is a minority language within 
Namibia, but it is the majority language in a number of villages in the south 
of the country. Other languages may be minority languages even within their 
own communities, such as Yan-nhanu at Milingimbi, where there are about 
15 speakers and 100 community members out of a total community of about 800. 
Of course, not all indigenous languages are endangered, and not all endangered 
languages are spoken by Indigenous people. 

You will need to work out who speaks what language, and whether they speak 
it as a first language or a second language (or a later language). It is useful to 
find out how exactly speakers learn such languages. It could be they picked them 
up as they grew up, with both languages being used in particular circumstances. 
Or they might only have learnt another language at school. 

I have argued elsewhere (Bowern 2008) that indigenous languages do not 
require special treatment in fieldwork; however endangered language commu- 
nities and communities where language shift is in progress do require particular 
skills from the fieldworker. Some of these are relevant to language contact; others 
relate to the causes of language loss and shift more generally, such as poverty 
and community fracture. For more information on endangered languages, see 
Tsunoda (2004) and Grenoble and Whaley (1998). 


2.4 Community/heritage languages 


Another language contact situation concerns community languages, also known 
as heritage languages. These are languages of immigrant community groups, such 
as German or Finnish in the USA, Bengali or Balochi in the UK, and Moroccan 
Arabic in France (compare Clyne 1991; 2003). I mention heritage language groups 
as distinct from indigenous groups because they are often treated differently in 
public policy (for example, in the English-only movements in the USA), and because 
there is often a standard language spoken outside the immigrant community which 
may serve as a point of comparison. Speakers may also look to that as a standard 
and may or may not consider their own variety to be different from it. 

The dynamics of language rhetoric and shift are also distinct. For example, 
Indigenous groups are often told to “modernize” their cultures by switching to 
the dominant language, whereas immigrant groups are usually told to “assimi- 
late.” Heritage language groups may have rather different views of their own 
variety vis-à-vis the standard. If they consider their own variety to be different, 
it may be that they consider it a “corrupted” variety of the language (and will 
want to teach you the standard). Conversely, if the population has been isolated 
from their country of origin for some time, they may consider their own variety 
to be more “pure” or “archaic.” Alternatively, there might be no perception that 
the variety is different from the standard. 


2.5 Is your field site a language contact situation? 


Unless you’re doing monolingual elicitation (see e.g. Everett 2001), some language 
“contact” is involved. Doing translation or fieldwork through another language 
will have an effect on your data. If you rely entirely on elicitation, you will bias 
the attested structures to ones that have easy translation equivalents in the 
language, and you'll probably only get the structures you think to ask about. 
Conversely, using only conversation will net you the most frequent structures but 
is unlikely to give you various other topics. For further discussion, see Bowern 
(2008: 115ff.). 

However, I assume you are aware of the effects of a contact language in doing 
fieldwork, and we are talking about language contact which is 
not caused by the methodology of the linguist. Usually, it will be fairly obvious 
whether language contact is currently part of your field site, although it will prob- 
ably not be clear what type of contact is occurring, who is participating, and to 
what extent. From the very beginning of your fieldwork, you should be on the 
lookout for differences between speakers. There might be generational differences, 
gender differences, or other differences (although of course there may be reasons 
other than language contact for such differences). You should also ask about 
language attitudes and about the language situation in the community. You 
may be able to get information about the demographics of the site and the wider 
area, which in turn may provide information about likely conduits of language 
contact. Don’t underestimate the utility of incidental observation as a source of 
topics for more detailed investigation by other methods. 

Be aware, however, that just because an area involves more than one 
language it does not mean that there is necessarily a great deal of language con- 
tact. For example, Houston (a city of about 5 million people) has a very diverse 
population, with speakers of many different languages. We might therefore 
assume that it is a good place to study language contact. Further study would 
show, however, that while Houston is diverse overall, individual neighborhoods 
tend to be very homogeneous, and so language contact is not nearly as great as 
one might suspect if one looked at the figures for the metropolitan area as a whole. 


3 What to Study 


There are many possibilities for studying language contact and its results in 
different communities. We know that language contact can affect all levels of 
language (Thomason & Kaufman 1988), and any of these areas may be usefully 
studied. 

The language contact may be an incidental part of the fieldwork. That is, you 
may be studying a language which also happens to be spoken in a language 
contact situation. Some of the issues that fieldworkers in such a situation should 
be aware of are listed in section 3.1 below. Alternatively, you could study the 
linguistic results of language contact. That also brings up a particular set of issues, 
some of which are discussed in section 4. 


3.1 Working on a single language in a contact 
community 


Traditionally, linguists have avoided working in multilingual areas if there is 
an option to work in a more homogeneous community. Handbooks often suggest 
trying to work with monolingual consultants where possible, to avoid the possibility 
of getting “contaminated” data (Kibrik 1977; Vaux & Cooper 1999: 8; Abbi 
2001). However, if the aim of fieldwork is to produce an accurate documentation 
of a linguistic variety as it is spoken by a group of speakers, language contact 
phenomena are a part of that and should be documented as well. To ignore code- 
switching, borrowing, and other contact phenomena is to produce an artificial 
description of that language. 

It is, of course, possible to work on a single variety. However, you are likely to 
miss things (in all areas of the study) if you are unaware of the wider situation. 
If the contact situation has changed in the period since previous work on the language, 
you may find differences between your data and earlier sources, which you might 
otherwise wish to ascribe to errors made by one or other of the linguists. You 
should compare your data with data outside the immediate community; this will 
allow you to gauge the effect of language contact in the area. For example, a study 
of the differences between the variety of Greek recorded in Dawkins (1916) 
and Greek spoken in Greece would show that the Asia Minor Greek varieties 
described by Dawkins vary in systematic ways from those spoken in mainland 
Greece. However, be wary of attributing all differences to language contact; it is 
tempting to see contact as the cause of all differences. In practice, cause can be 
very difficult to ascribe with certainty. For example, it is obvious that the syntax 
of Texas German differs in systematic ways from Standard High German, and 
that many of the features which distinguish Texas German from Standard 
German are ones which Texas German shares with English. However, given 
that Standard German was only one of the inputs to Texas German (along with 
a number of regional dialects which differed markedly from Standard German), 
Standard German did not change into Texas German, so not every difference between 
the two reflects a change, contact-induced or otherwise. For further discussion of 
the interplay of borrowing, drift, and change, see Jones and Esch (2002). For more 
information on Texas German, see Boas (2003) and Salmons and Lucht (2006). 


3.2 Working on the contact situation 


Classic and more recent studies of language contact areas and their linguistic 
results include Dawkins (1916) for the Greek spoken in (Turkish-dominant) Asia 
Minor, Gumperz and Wilson (1971) for Kupwar village and the convergence 
of three languages from different families, Aikhenvald (2004) for the language 
contact situation in Amazonia, Nurse (2000) for Daiso (an area between Kenya 
and Tanzania), Heath (1978) for Arnhem Land, Ho and Platt (1993) for Singapore 
English, and many others. All of these studies were based on fieldwork, using 
various techniques. The data for Nurse (2000) began as a linguistic survey, while 
Aikhenvald (2004) is based on detailed work, initially with a single community 
in the region. 

One way to approach such a study would be to take an anthropological lin- 
guistic perspective of the contact area and define the contact interaction itself as 
the focus of study. Working on the contact situation and its linguistic results can 
be framed in terms of a number of questions. What languages are used in the 
community? Who speaks what language? How well do they speak those languages? 
Who do they talk to, and what language do they use when they talk to them? 
That is, when are the different languages used? When people communicate, how 
do they utilize each of these languages? Do they code-switch? What led to this 
situation? Why are there multiple languages in use in the community? How do 
community members acquire the different varieties in use in the community? Does 
everyone have the same degree of multilingualism, or is it asymmetric? What 
factors govern who is multilingual? Do all sections of the community participate 
in the multilingualism, or just some? 

Working on a complex contact area requires preparation and it may take some 
time to begin to unravel the threads of the area. It might be necessary to have 
competence in several unrelated languages before seriously starting work. It 
might be necessary to work with a diverse cross section of the community in 
order to find the relevant variables to focus on. Such field sites should not be con- 
sidered short-term “quick studies.” A superficial familiarity with the community 
is bound to result in bad generalizations. 

Aikhenvald (to appear: 1) presents the study of language contact in terms of 
the description of the multilingual competence of a single speaker. That is, she 
frames the discussion in terms of describing the social and linguistic competence 
of a speaker with multiple grammars at their disposal. It is, of course, possible 
to take the monolingual linguistic competence model and adapt it to a multi- 
lingual person. There is some work of this type, including Halmari (1997). 
However, most studies of contact take either a community-oriented view or a 
historical one, and discuss the situation not in terms of competing (or comple- 
mentary) grammars within individual speakers but as a set of community 
resources which influence each other. 


3.3 Working on a contact variety 


A further approach to language contact is the study of a single contact variety, 
such as a pidgin, creole, or mixed language (for the terms see Holm, this volume). 
I would argue that from the fieldwork perspective, work on such languages is no 
different from that described in section 3.1, that is, working on a single language 
in a contact situation. Contact varieties can (and should, I would argue) be 
approached with exactly the same tools that are used for any other type of lin- 
guistic fieldwork. 

Resist the temptation to describe the language in terms of the other languages 
that went into its genesis. That is, it is very tempting to describe a contact 
variety such as Mednij Island Aleut in terms of Russian and Aleut (the two lan- 
guages which contributed to the variety), or to describe Young People’s Dyirbal 
in terms of the differences between it and Traditional Dyirbal (Schmidt 1985). 
However, in many cases, there will be features of the new variety which are not 
present in either of the contributing varieties. After all, Modern Bardi has con- 
siderably more features of polysynthesis than the Bardi of the 1920s, but Bardi 
has not acquired those features from English even though the primary language 
contact in that community has been with English. Moreover, just because a 
contact variety arose, it does not imply necessarily that the contact situation is 
the same today. Kriol is a contact variety which arose in northern Australia about 
a hundred years ago, and most of the languages spoken by the community at 
Roper River who gave genesis to the creole are no longer spoken. These days, 
the main contact is between Kriol and English, not between Kriol and indigenous 
languages (Harris 2004; 2007). 


4 Linguistic and Paralinguistic Effects of 
Language Contact Relating to Fieldwork 


4.1 Identifying cause and effect 


Identifying cause and effect within data is known to be difficult, not only in lin- 
guistics. This is also true in establishing that particular shared similarities are 
due to language contact. After all, any given shared feature may not be due to 
language contact between those two varieties, but rather to retention of shared 
features (if the languages are related), borrowing from a third language, calquing, 
or chance. Frequently, in language contact areas, any feature which may be 
attributed to contact is so attributed. This in turn implies a particular view of 
language change and of language use in a community. Establishing the causes of 
change and the sources of particular constructions in contact areas is extremely 
difficult, and plausible analyses may be found which could cover several differ- 
ent situations. Therefore it is very important to think about the evidence for a 
claim of contact-induced language change. 


4.2 Variation in data 


Multilingual communities can exhibit a great degree of variation, both within 
the data from a single speaker and throughout the community as a whole. This 
can trip up the unsuspecting fieldworker who expects to be describing a single 
cohesive variety. (Of course, there is variation in all language, but the variation 
in complex contact areas with multilingual speakers is most clearly in evidence, 
since speakers can draw on multiple speech varieties for different purposes and 
the fieldworker does not have access to those nuances.) 

A linguistic variable is any linguistic item which has different realizations. The 
different realizations may be conditioned by any number of factors, including 
the age of the speaker (speakers over 50 may have a preference for one pronun- 
ciation of an item, whereas school-age children may systematically use another 
realization of that item), their social class, race or ethnicity, gender, level of 
education, or degree of familiarity with other languages. That is, contact-induced 
variation cross-cuts the other types of variation which are found in communities, 
no matter whether language contact is present. Variation may be stable (such as 
[aks] ~ [ask] for English ask) or it might be indicative of a linguistic change in 
progress. 
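
A first pass at finding out what conditions a variable can be as simple as counting realizations within speaker categories. The following minimal sketch in Python cross-tabulates the realizations of one variable by age band. The token records, the age cut-offs, and the function names are hypothetical and invented purely for illustration; they are not taken from any of the studies cited in this chapter.

```python
from collections import Counter, defaultdict

# Hypothetical tokens: (speaker code, speaker age, realization of the variable).
TOKENS = [
    ("S1", 62, "ask"), ("S1", 62, "ask"), ("S2", 55, "ask"),
    ("S2", 55, "aks"), ("S3", 14, "aks"), ("S3", 14, "aks"),
    ("S4", 11, "aks"), ("S4", 11, "ask"),
]

def age_band(age):
    """Assign a speaker to an illustrative age band (the cut-offs are arbitrary)."""
    if age < 18:
        return "school-age"
    if age <= 50:
        return "adult"
    return "over 50"

def tabulate_variants(tokens):
    """Count each realization within each age band."""
    table = defaultdict(Counter)
    for _speaker, age, variant in tokens:
        table[age_band(age)][variant] += 1
    return table

if __name__ == "__main__":
    for band, counts in sorted(tabulate_variants(TOKENS).items()):
        total = sum(counts.values())
        shares = ", ".join(f"{v}: {n}/{total}" for v, n in sorted(counts.items()))
        print(f"{band}: {shares}")
```

With real data it is worth keeping the counts per speaker as well, since an apparent age pattern may turn out to be the habit of one or two individuals. The same tabulation can be repeated for gender, language background, or any other category recorded in your speaker metadata.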


4.3 The language under study 


In areas where extensive code-switching is normal, it might be extremely un- 
natural for someone to talk in a single language. This makes it very difficult for 
them to produce extended chunks of speech without code-switching. It leads to 
much greater self-monitoring and may interfere greatly with grammaticality 
judgments. That is, the speaker is so concentrated on the production of speech in 
a single language that they produce sentences which are stilted and unnatural. 
The best way to deal with a situation like this is not to worry too much about 
the code-switching, but to check the language later with another speaker. 
Speakers will often correct code-switches in such contexts. This gives you valu- 
able data not only about what forms speakers feel belong to one language or another, 
but also what the equivalent nonswitched forms are. 

Similar issues may occur if the language is not used regularly, for example 
if the language is highly endangered and most of the population has already 
shifted to another language. If the person is not used to producing sentences 
in that language, you may get answers which include words and syntax from 
several languages. Some discussion of working with semi-speakers can be found 
in Bowern (2008: 137-9), including how to encourage people who may not have 
spoken the language for a long time. 

Speakers may be unable to tell in some situations what forms belong to which 
language. Although we learn from introductory linguistics onward that speakers 
intuitively know what is well formed in their language and what is not, in prac- 
tice, speakers cannot always assign the correct form to the correct language. 
Judgments about which word belongs to a particular language may vary considerably. 
For example, if you ask a speaker of English if [ˈdʒʌntə] ‘junta’ is 
an English word, there are several possible responses. On the one hand, it is a 
loan from Spanish, and therefore not an English word in the sense that stone is an 
English word. On the other hand, junta is not pronounced [ˈdʒʌntə] in Spanish; 
therefore it’s not the same as the Spanish word with the same orthography. I 
have witnessed this problem frequently in my own fieldwork in eastern Arnhem 
Land, especially in the context of extensive elicitation and direct questioning. 
Occasionally in fieldwork involving lexical elicitation I have had speakers pro- 
duce the target word in English, with the phonology of the target language. That 
is, I have asked for the word sort and received the answer [sort"]. 

In extreme cases, you may get data in the “wrong” language completely. For 
example, McDonald and Wurm (1979) is a grammar of the Wangkumara language 
and was documented from a single speaker. However, the speaker called the 
language Garlali, which is the name of the language spoken to the east of 
Wangkumara, and in fact, the authors were under the impression (as far as proof 
stage of the grammar) that the language they were documenting was Garlali. 


4.4 Which language to record, which variety to 
document 


Language communities with extensive variation present questions for documen- 
tation. In some types of language contact community, it is common for young 
people to have a very different way of speaking from their parents and grand- 
parents. These “young people’s varieties,” as mentioned above, are quite com- 
mon in Australia (cf. Schmidt 1985; Lee 1987; O’Shannessy 2005; among others). 
Of course, both the older more traditional way of speaking and the young 
people’s speech are extremely interesting and could be the target of a documen- 
tation project. However, the variety you choose to do fieldwork on has con- 
sequences for the project. The community may feel that the young people do not 
speak properly, and that if you use their variety as the basis for the documenta- 
tion you will not be recording the right version of the language and you risk 
alienating some of the most powerful people in the community. On the other hand, 
if you pick the more prestigious variety and then base language learning or other 
documentation materials on it, you are likely to alienate the target audience for 
the materials. In such cases, let the community guide you, although of course this 
supposes that “the community” is in agreement, which is unlikely to be true. 

Some thought should be put into documentation materials in language contact 
areas. Linguists tend to talk about “giving back” to the community (e.g. Crowley 
2007: 34; Bowern 2008: ch. 14) and frequently this takes the form of learner’s guides, 
school project materials, oral history books, language software, or other materials 
designed to promote or showcase the language. However well-meaning such 
proposals may be, in some areas they might be greeted as inflammatory or 
patronizing. Consider the response that writing street signs in African American 
English would have in Harlem, for example, especially if it were promoted by 
university academics. 


4.5 Language attitudes 


Even if the primary object of study is a single variety in a contact situation and 
not the language contact itself, language attitudes will likely have a great effect 
on both the data and the circumstances of the fieldwork. 

First, if you are working with people who are fluent in both the local regional 
language and the standard language, you may only get data from the standard 
as the default response. This is especially true if you are a native speaker of either 
the standard variety or speak a different variety of the language you are doing 
fieldwork on. Because it is usual in such circumstances for people to speak the 
standard language, it may be quite difficult to elicit anything else. Furthermore, 
certain types of linguistic techniques tend to promote particular linguistic inter- 
actions (and it is necessary to be aware of them even if they are not what you are 
studying). Elicitation is a type of formal or semi-formal interview, and formal 
contexts tend to elicit particular varieties in areas of high diglossia. Vaux and Cooper 
(1999) have some suggestions for encouraging discussion of low-prestige varieties. 
Recruiting local research participants to do interviews will also minimize this 
problem. 

The race, ethnicity, and/or gender of the participants in the conversation may 
affect the data. It might be hard to overcome tendencies to talk in one language 
to someone from that group. There is a reported issue in language documenta- 
tion that elders who are used to not talking their language in front of younger 
community members find it difficult to break those habits. I once had a conversation 
with someone where I spoke his language and he spoke English. At one 
point he apologized that he had never spoken his language in front of a white 
person before and found it impossible to do so. 

In a highly multilingual community, it may be that the medium you work through 
as a field language will harm your ability to get data in your target language. 
This is especially the case where there is a default language which overrules the 
use of other languages or varieties, for example where there is a standard language 
and you are trying to study a local variety. There are numerous cases of fieldworkers 
reporting that they had to deny knowledge of major languages (which 
they in fact spoke fluently) in order to have any access to the smaller languages. 

Aspects of language politics and language purity may also come into play. In 
section 4.3 I mentioned the case of a man who gave linguists a different language 
from the one they thought they were working on because of a personal prefer- 
ence for speaking one language over another. In other cases, speakers may reject 
all words which are shared with other languages they know as being “not proper 
X” or “borrowed from Y,” even if those words are in general use or are not, 
in fact, borrowed. In one of my field sessions, one person rejected all present 
tense forms from a particular verb conjugation, on the grounds that they were 
“borrowed” from the local lingua franca. In fact, that form is common to a 
number of related areas in the region and the shared forms are retentions from 
an earlier protolanguage, and not borrowed forms. Moreover, the morphemes in 
question are used in different ways in the two varieties. 


5 Field Techniques 


Working in multilingual areas makes it especially important to have good data- 
gathering procedures. It is also especially important to document field notes and 
recordings. If more than one language is being worked on at a time, an undocu- 
mented collection can make it extremely difficult to sort out what language is what, 
especially if there is more than one undocumented language in the sample. 


5.1 General field techniques 


As I have stressed elsewhere in this chapter, many of the techniques used in field 
linguistics more generally are also applicable when working on language contact 
(see Newman & Ratliff 2001; Gippert, Himmelmann, & Mosel 2006; Crowley 2007; 
and Bowern 2008 for some advice). The linguist should take nothing for granted, 
but test all assumptions. It is often necessary to keep several conflicting hypotheses 
in mind simultaneously. Work with multiple speakers, compare results, be aware 
of variation and homogeneity in the data. Keep good records. Learn to speak the 
language under study and use the intuitions you gain from familiarity with your 
data for more data testing. Look for patterns rather than individual features. 

Field sites with extensive linguistic variation are extremely confusing when the 
fieldwork is at an early stage. It is difficult enough to do fieldwork on a single 
language; that complexity is compounded when there is need to work not only 
on several languages at once, but also on the relationships between those languages. 
Many sites are not so complex, thankfully. 


5.2 Ethnographic methods 


Most of your information on language contact in a field site will come from three 
sources. The first is what you can infer from the languages themselves, from data 
and techniques such as those described in section 5.3. The second is interviews 
and self-reports. Observation is the third source of potential information. 

It is fine to ask people about their views of language and when they use 
particular items. You can ask about impressions about who speaks the “same” 
and “differently,” who uses particular varieties, which varieties are prestigious 
and which are stigmatized, and other information about language attitudes and 
perceptions of variation. Niedzielski and Preston (2003) is a discussion of folk 
linguistics and folk linguistic categorizations; Eckert (2000) is a very detailed 
example of the application of ethnographic methods in sociolinguistics (although 
not in a language contact area). 

What speakers identify as a contact might not be true contact. That is, while 
speakers may be sensitive to the similarities and differences between the languages 
they speak, they probably won't know the history of language contact and there is 
no a priori reason to take their statements about what are and aren’t borrowings 
at face value. 

Observation is also a very powerful tool for investigating language contact. It 
is a core tool in sociolinguistics and should be used in documentary/descriptive 
linguistics, and even in fieldwork aimed primarily at theoretical topics. After all, 
the best way to find out what is interesting in a speech community (and worth 
testing further) is to do some exploratory work. 

There is extensive discussion of observation and ethnographic methods in the 
sociolinguistic literature. Johnstone (2000), Milroy (1987), Meyerhoff (2006), and 
Wardhaugh (2010) are good references to begin with for further information. 


5.3 Grammaticality judgments, elicitation, and 
stimulus materials 


While ethnographic methods such as participant observation are invaluable 
for fieldworkers, direct questioning about linguistic data is also necessary. 
Sociologically oriented studies of language tend to play down the value of trans- 
lation and grammaticality judgments (and the like) on the grounds that they are 
easy to manipulate unintentionally, difficult to quantify, and subject to unreliability. 
However, like all methods, elicitation is dependent on the skills of the 
person using the method, and these methods work better in some cases than in 
others. Furthermore, direct translation or elicitation is not the only method for 
obtaining linguistic data. The methods considered in this section fall into three 
types: translation/elicitation, controlled tasks, and content checking. 

In translation (or elicitation narrowly defined), the linguist asks the speaker of 
the language to translate sentences from the contact language into the target lan- 
guage. The translations are designed to give the linguist information about the 
different vocabulary and structures in the language. Problems with elicitation 
as the primary method for data gathering include biases of frequency (that is, 
elicitation gives the linguist information about (1) structures that are easy to 
translate, and (2) structures that the linguist thinks to ask about), and the possi- 
bility of data contamination from unnatural translations (that is, the translation 
is a literal one, but is not a felicitous expression in the language). Some of these 
problems can be avoided by being explicit about the task instructions, and by check- 
ing the answers with another speaker. 

Controlled creative tasks and stimulus materials avoid some of the problems 
mentioned in connection with direct translation, because they do not bias word 
or construction choice. Examples of such tasks include retelling the pear stories 
(Chafe 1980), frog stories (Berman & Slobin 1992), segmenting color wheels (Berlin 
& Kay 1969), providing vernacular definitions or descriptions of items, and describing 
video clips, pictures, or objects. The results are then transcribed and translated 
into the target language and glossed for analysis. Such tasks are very useful 
in documentation because they allow the linguist to guide the topic and struc- 
tures for description without prespecifying them. However, it is difficult to get 
exactly parallel data this way. 

The third method is content checking. That is, the linguist asks specific ques- 
tions about constructed sentences or previously gathered data in order to find 
out more about the language. These questions might take the form of requests 
for grammaticality judgments (“Is X a good sentence?”), requests for more 
information about the meaning of a word or phrase, back-translation, or check- 
ing transcriptions and translations from earlier sessions. All of these methods 
are likely to produce variable answers and can serve as the input to further 
investigation. 

Finally, consider the secondary uses of materials originally collected for 
another purpose. A well-annotated collection of data can be used for multiple 
projects. Once a set of sound files has been segmented and tagged for a study of vowel
length, for example, the same data can be used to study consonant duration or for a
formant analysis. A set of texts which have been edited to remove code-switching
for a published version also serves as the basis for a study of that code-switching. 


5.4 Metadata and annotation 


You need some way of keeping track of all the data that you have recorded. Good 
data are made much more useful by good metadata (that is, data about the data; 
see Bowern 2008: 56ff.). Annotated data are always easier to use
than unannotated data: compare a set of notes in an unknown
language without translations with one that has translations. Data from language
contact situations are complex because they contain pieces from interlocking
systems with different properties. Good information about the data makes
that complexity easier to manage.

Keeping track of linguistically diverse data can be challenging. Data points
should be associated with the speaker (or with speaker categories, when trying to
work out what the patterns of variation are). That is, you should record where all
your data come from. This means you will also need good speaker metadata: at 
least age (or approximate age), gender, language background, class, occupation 
(if relevant), and clan (if relevant). You will need to know where and when each 
session was recorded, and who else was present. 
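
One way to keep this information together is a small structured record for each speaker and each session. The following is only an illustrative sketch: the names `Speaker`, `Session`, and `sessions_with`, and all of the field names, are assumptions made here and do not follow any established metadata standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Speaker:
    # Hypothetical fields mirroring the speaker metadata listed above.
    speaker_id: str
    approx_age: Optional[int]
    gender: str
    languages: List[str]                 # language background
    social_class: Optional[str] = None
    occupation: Optional[str] = None     # record only if relevant
    clan: Optional[str] = None           # record only if relevant


@dataclass
class Session:
    # Where and when the session was recorded, and who was there.
    session_id: str
    date: str                            # e.g. an ISO date string
    place: str
    speaker_ids: List[str]
    others_present: List[str] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)


def sessions_with(sessions: List[Session], speaker_id: str) -> List[str]:
    """Return the ids of all sessions in which the given speaker was recorded."""
    return [s.session_id for s in sessions if speaker_id in s.speaker_ids]
```

Keeping speakers and sessions as separate records, linked by an identifier, means that each data point can be traced back to a speaker without repeating the speaker's details in every session.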

Next, you will need to be able to find your data again, so you will need some 
way of keeping track of the topics you have recorded. If you use the same prompt 
materials with several speakers, it will be most useful if the data are transcribed 
in a format that allows direct comparison between the versions. Time-aligned
transcription (that is, recording with a digital recorder and transcribing the
recordings using software which allows each sound clip to be lined up with its
transcription) is now possible, and several free programs for it are available.


6 Conclusions 


In many ways, multilingual fieldwork is no different from fieldwork in a lin- 
guistically homogeneous community. Fieldwork in language contact situations is 
diverse, just like fieldwork everywhere. The best approach to a linguistically diverse 
fieldwork community is a diverse analysis toolkit which includes a variety of tech- 
niques coupled with good quality recording and annotation. Where feasible, 
such a toolkit includes observation and other ethnographic techniques, backed 
up with experimental and elicited data from a sample of the population. 


NOTES 


1 For our purposes, a broader definition is more useful than a narrow one, since we are 
considering the situations in which one language may influence another, and how that 
relates to data gathering in the field. It is not necessary to quibble over whether the 
use of Latin in mediaeval Europe constitutes language contact, when the contact does 
not involve two groups of speakers with different first languages, one Latin, one 
vernacular. 

2 That is, how many languages or “lects” (varieties) of a language a person controls. 

3 For the purposes of studying contact in fieldwork, we need not make a difference between 
contact between languages and contact between standard and nonstandard varieties. 
4 Note that throughout this article “multilingualism” implies competence in more than
one language (that is, it is a cover term for non-monolingualism); I am not distinguishing
bilingualism from multilingualism here.


5 Furthermore, language contact may leave its mark on a language but the populations 
may no longer be in contact with one another. 


6 See http://tgdp.org. 


18 Macrofamilies, 
Macroareas, and Contact 


JOHANNA NICHOLS 


At great time depths it becomes hard to tell the areal from the genealogical. Sounds 
and meanings change, derivation is reshaped, words are lost, and grammar 
changes, leaving languages with structural resemblances that are unlikely to be 
due to typology, universals, or chance and must therefore probably reflect shared 
history — but cannot tell us whether the ancestral languages were sisters, neighbors, 
or both. Recent years have seen progress in identifying levels of borrowability 
and inheritability (Wichmann & Holman 2009; McMahon & McMahon 2005: ch. 4; 
Nichols 1995; 2001; 2003; 2005), and there is steady progress in typological work of 
all kinds, raising hopes that more precise histories will be attached to linguistic 
resemblances in the near future. For now, though, problems remain. This chapter 
surveys a number of the well-known proposed linguistic macrofamilies and 
considers the quality of evidence for their relatedness by descent and by areality. 
It also identifies some linguistic macroareas and some far-flung structural resem- 
blances that are good candidates for old contact zones. 

I use the term stock to refer to the oldest level in a genealogical lineage that is 
both demonstrably a family and reconstructable (in principle, i.e. displaying 
some regular correspondences; it need not have actually been reconstructed yet) 
(Nichols 1997). Every isolate counts as its own stock. Family is a general term for 
any demonstrated descent group at any level. Macrofamily refers to hypothetical 
or debated older groupings posited by comparing proven families; as will 
become clear below, not everything here called a macrofamily is demonstrably a 
family or even likely to be one. 

Criteria for genealogical relatedness are at least one strongly resemblant multi-element
grammatical paradigm and/or enough resemblant (and sufficiently
resemblant) shared roots to exceed chance (the individual-identifying threshold: 
Nichols 1996; 2009; see also Campbell 2003). Paradigmatic evidence was used to 
establish the relatedness of Afroasiatic and Algic (Greenberg 1960; Goddard 1975; 
Nichols 1996; 2009). As a rule of thumb, a four-member set with close formal and 
functional resemblance (like the four *CV personal pronominal prefixes of Algic) 
exceeds what can be expected by chance and is very strong and perhaps even 
sufficient evidence of relatedness by descent. Long-range comparisons often 
compare single grammatical formatives, but only if they participate in a larger 
paradigm can grammatical formatives have probative value; otherwise they have 
the same probability of occurrence as individual lexemes. 

Shared lexical roots, i.e. putative cognates, are what is usually sought and offered 
as evidence of relatedness, but rarely do they exceed the threshold of chance in 
both quality and quantity. As a rule of thumb, when working with a fixed and 
closed word list like the Swadesh list, where exactly the specified 100 or 200 glosses 
must be looked up and one cannot cast about among words of similar meaning 
to find a closer formal match, if the segmental resemblances are constrained to 
fairly close similarity and no selective parsing, positing of metathesis, etc. is per- 
mitted, 5 out of 100 words or 10 out of 200 exceeds chance (95 percent confidence 
threshold) provided each word contains at least two consonants and a proper mor- 
phological analysis is done so that it is roots that are compared. If a modest amount 
of semantic casting about is allowed, the numbers required rise to 15/100 and 
28/200. If more phonological leeway and selective parsing are allowed the numbers 
rise to 47/100 and 88/200. If one-consonant items are compared, the numbers rise 
by about one required word per two compared words. (Grammatical formatives, 
compared one by one, are often among these one-consonant items.) In the kind 
of comparison that is usually done in seeking evidence of relatedness, where 
considerable formal and semantic leeway is allowed and all available lexical 
resources are used (rather than a closed word list), over 400 putative cognates 
are required for a 1,000-word comparison (1,000 words is equivalent to a small 
field dictionary or glossary) and about 2,000 for 5,000 words (this is the size of 
a student’s dictionary or good-sized field dictionary; for these figures and their 
calculation see Nichols 2010). That is, where lexical resources are modest one needs 
a few hundred putative cognates, and where they are extensive a few thousand; 
the number is reduced if formal and semantic constraints are imposed. 
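
The closed-list rules of thumb given above can be set out as a simple lookup. This is only a sketch of the figures quoted in the text, not of the calculation behind them (for which see Nichols 2010). The regime labels and the function name `meets_threshold` are assumptions introduced here, as is the reading of each figure as a minimum count. The open-lexicon figures are left out because the text gives them only approximately.

```python
# Minimum numbers of putative cognates quoted in the text for closed
# word lists of 100 and 200 glosses (two-consonant roots). The regime
# labels are hypothetical names for the three conditions described.
THRESHOLDS = {
    "strict": {100: 5, 200: 10},                 # close formal match, fixed glosses
    "semantic_leeway": {100: 15, 200: 28},       # modest semantic casting about
    "more_formal_leeway": {100: 47, 200: 88},    # more phonological leeway, selective parsing
}


def meets_threshold(matches: int, list_size: int, regime: str) -> bool:
    """Check a count of putative cognates against the quoted rule of thumb.

    Assumes (an interpretation, not stated outright in the text) that the
    quoted figure is the smallest count that exceeds chance.
    """
    needed = THRESHOLDS[regime][list_size]
    return matches >= needed
```

For example, `meets_threshold(12, 100, "semantic_leeway")` is false on this reading, since 15 matches are called for once semantic leeway is allowed.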

Areas are usually identified by their linguistic resemblances and defined cat- 
egorically: some set of features is found in all and/or only the languages of the 
area (e.g. Emeneau 1956; Masica 1976; Joseph 1983; Campbell, Kaufman, & Smith- 
Stark 1986; Enfield 2005). This ideal is unworkable for seeking traces of ancient 
areas, as former area members may have moved away, the area may be overlaid 
with nonconforming languages, and the areal features may have undergone 
change in previously conforming languages. Bickel and Nichols (in press), 
Güldemann (in press), and Donohue and Whiting (submitted) define areas on 
nonlinguistic grounds (geography, etc.) and seek statistically significant differences 
in the frequency of features inside versus outside the area. Here I assume some 
kind of probabilistic definition of areality that makes it possible to consider 
noncategorical resemblances as diagnostic of now-inactive areality. 
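
A probabilistic definition of this kind can be illustrated with a simple test of whether a feature is more frequent inside a geographically defined area than outside it. The sketch below uses a one-sided Fisher exact test, chosen here purely for illustration. The works cited above may use quite different statistics, and the function name `area_enrichment_p` and its parameters are assumptions.

```python
from math import comb


def area_enrichment_p(in_with: int, in_without: int,
                      out_with: int, out_without: int) -> float:
    """One-sided Fisher exact test for a 2x2 table of language counts.

    Returns the probability of finding at least `in_with` feature-bearing
    languages inside the area if the feature were distributed without
    regard to the area boundary (hypergeometric upper tail).
    """
    n_inside = in_with + in_without        # languages inside the area
    n_feature = in_with + out_with         # languages with the feature
    total = n_inside + out_with + out_without
    denominator = comb(total, n_inside)
    upper = min(n_inside, n_feature)
    tail = sum(
        comb(n_feature, k) * comb(total - n_feature, n_inside - k)
        for k in range(in_with, upper + 1)
    )
    return tail / denominator
```

A small value indicates that the feature is concentrated in the area beyond what chance would lead one to expect. The test takes no account of genealogical relatedness among the sampled languages, which any real study would need to control for.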

Of course all families descend from earlier ancestors, and so on, and many 
families must have surviving distant sisters. Are the macrofamilies proposed in 
the literature the best candidates for these older groups? The following sections 
survey some genealogical characteristics of well-known macrofamilies and attempt 
to factor out likely genealogical from likely areal properties, and to distinguish 
both from chance and universals. Very large macrofamilies for which no plausible 
support, either areal or genealogical, has been adduced are left out (primarily 
Amerind, Dene-Caucasian, Indo-Pacific), as are a number of smaller long-range 
comparisons which I cannot judge. A final important preliminary is that, in 
claims of genealogical relatedness, the burden of proof rests on the proponent. 
That burden has usually not been shouldered for macrofamilies, so the sections 
below assess the probative value of what evidence has been offered. 


1 Africa 


Africa is the clearest case, as the work of deciding whether macrofamilies are 
genealogical or areal groupings is done in Güldemann (in press). In the view of 
nonlinguists (and some linguists), African languages fall into the four macrofamilies 
of Greenberg (1963): Afroasiatic, Niger-Kordofanian, Nilo-Saharan, and Khoisan. 
Of these only Afroasiatic is a proven family (subsuming the Berber, Chadic, extinct 
Egyptian, Semitic, and Cushitic stocks, and Beja and/or the Omotic family if those 
are not Cushitic). It is a family of very great age, undoubtedly the world’s 
oldest demonstrated family: the Semitic and Egyptian branches are attested in 
writing from about 4,500 years ago, and the oldest Semitic languages are of 
West Germanic-like or Romance-like diversity, giving Semitic alone an Indo- 
European-like age; the relationship between earliest Egyptian and Proto-Semitic 
is considerably more distant than that between Romance or Germanic languages, 
making Afroasiatic well over 8,000 years old at minimum on just this evidence. 
Though relatedness of the whole family can be demonstrated on paradigmatic 
and other grammatical evidence (Greenberg 1960; Newman 1980), lexical evidence 
is lacking. Evidently the cumulative effects of sound change, semantic change, 
and vocabulary replacement have made it impossible to identify recurrent 
correspondences in sufficient numbers to reconstruct sounds and vocabulary.¹
Without this, subgrouping of the major branches is impossible. 

Afroasiatic is a family, but it also bears a geographical and a typological 
description. Güldemann (in press) observes that three of Greenberg’s macrofamilies
were based on typological criteria and all three contribute heavily to defining
areas. Güldemann proposes five macroareas in Africa, listed below, each centered
chiefly on one of Greenberg’s macrogroups but all involving several different families
and more than one macrogroup. The defining properties of each macroarea
are typologically rare features that are frequent in the area but rare or lacking
outside of it. The strongest diagnostic features are rare worldwide except in the area;
but Güldemann also admits continental diagnostics which are frequent in the area
but otherwise rare in Africa (though not necessarily rare worldwide). 


1 Macro-Sudan Centered on Greenberg’s Niger-Kordofanian, and geographically
on the southern Sahel and nearby savannah zone. Characteristic features
are (see Güldemann 2008 for details): obligatory logophoric marking; labial-velar
consonants; ATR (advanced tongue root) vowel harmony; complex tone
systems (three or more levels); vowel nasalization; S(Aux)OVX word order;
comparatives using ‘exceed’ as comparative marker; implosives; V-O-NEG order; 
and labial flap consonants. 

2 Kalahari Basin Centered on Greenberg’s Khoisan; see Güldemann (1998).
Features are clicks; ejectives; aspirated stops; phonotactics whereby clicks and 
other strong consonants are only root-initial; head-final genitives and head- 
final NP morphology in general; and lack of subject cross-reference (though 
several languages have object cross-reference). 

3 Chad-Ethiopia Disproportionately Afroasiatic: head-final syntax (a continental 
diagnostic); complex predicates using light verb ‘say’; peripheral case, i.e. three 
or more cases (a continental diagnostic); polar questions marked by affixes; 
no /p/ (this trait is also in Berber). 

4 The Berber spread zone In the Sahara, filled historically by the recent spread 
of Berber, a branch of a Chad-Ethiopia family (Afroasiatic) and now under- 
going large-scale shift in an ongoing spread of Arabic. 

5 The Bantu spread zone In much of the south, filled now by the fairly recent 
spread of Bantu, one branch of the Macro-Sudan family Benue-Kwa. This spread 
is ongoing, engulfing the languages of the Kalahari Basin. 


Thus all four of Greenberg’s macrofamilies are strongly associated with areal 
types, and all five macroareas are centered on macrofamilies. Each area has a 
distinct but not family-defining structural profile. Only one of the macrofamilies 
is a demonstrated family (Afroasiatic), and it is extremely old and just barely 
detectable by traditional methods. Two of the active areas, Chad-Ethiopia and the 
Kalahari Basin, may have formed a single greater Rift Valley area in prehistory. 


2 Eurasia 


Long-range comparison in Eurasia has centered for decades on the macrofamily 
called Nostratic, originally proposed by Holger Pedersen in the 1920s to account 
for resemblances among the Indo-European, Uralic, Turkic, Tungusic, Mongolian, 
Yukagir, Eskimo-Aleut, and Afroasiatic stocks. Recent works are Dolgopolsky (1998), 
Bomhard and Kerns (1994), and Greenberg (2000; 2002) (where a similar grouping 
is called Eurasiatic). These studies include Indo-European, Uralic, Tungusic- 
Turkic-Mongolian (itself a macrofamily), Korean, and Japanese in this macrofamily; 
Dolgopolsky’s Nostratic also includes Kartvelian, Afroasiatic, and Dravidian, 
Bomhard’s also Sumerian, and Eurasiatic also Nivkh (Gilyak), Ainu, Chukchi- 
Kamchatkan, and Eskimo-Aleut. These works offer almost exclusively lexical 
evidence. Bomhard and Kerns (1994) and Dolgopolsky (1998) are carefully worked 
out correspondence systems with extensive cognate sets (Dolgopolsky has 2,300 
reconstructed Proto-Nostratic roots, Bomhard 652).² Greenberg (2000; 2002) has 
72 grammatical formatives as putative cognates, most of them monoconsonantal 
or even monosegmental, and 440 lexical items, variously with one, two, or three 
consonants. Greenberg does not seek regular correspondences and therefore his 
items have considerable formal latitude; Dolgopolsky and Bomhard & Kerns 
do seek regular correspondences, but since their correspondences are not unique 
they also count as wide-ranging resemblants; all these works allow considerable 
semantic latitude. All draw heavily on stock protoforms (Proto-Indo-European, 
etc.), which increases the chances of attestation and the formal and semantic 
latitude. Given the extensive lexical sources available for many of the languages 
and the large number of languages and stocks compared (any few of which may 
contribute evidence for one or another putative cognate), the numbers of items 
they present are one or two orders of magnitude too few to exceed chance. 

There are good areal traits (as defined above) among these languages: m-T 
pronouns, i.e. personal pronoun sets with m in the first person and an apical 
obstruent in the second (Nichols & Peterson 2005); strict head-final syntax even 
in NPs (Dryer 1989, 1992); front rounded vowels (Maddieson 2005; Crothers 1976); 
and several other features (Bickel & Nichols 2003; Nichols & Peterson 2005). Several 
of these are secondary in the language families in which they appear (for the 
pronouns see Comrie 1998; Vovin 1998; Nichols 2001). They define a Eurasian 
macroarea that centers around southern Siberia and has expanded because of 
post-Neolithic language spreads along the Eurasian steppe and Silk Road (Bickel 
& Nichols 2003; in press). Long contact, language shifts, nomadic residence 
patterns, military-ethnic organization, cross-language continuities of clans and hence 
marriage patterns, and the durable sociolinguistic and demographic dominance 
of cattle and horse breeders over their neighbors, together with repeated language
spreads and shifts on the steppe, have made southern Siberia,
Mongolia, and northwestern Manchuria an epicenter of linguistic contact and diffusion
in the greater Silk Road area (e.g. Nichols 1998;³ Janhunen 1996a; 2001; for
the modern situation Grenoble & Whaley 2006: 70-8). 

An identifiable cluster in this large area is the Altaic macrofamily, consisting 
of the strikingly similar and distinctive Turkic, Mongolian, and Tungusic families 
and the morphosyntactically similar Japanese and Korean. The first three have 
been in intense contact and share many diffused words over this long chrono- 
logy. The prehistory of Japanese and Korean is unknown, but they have not been 
in direct contact for at least the millennium and a half of known Japanese history 
and writing. Claims of their relatedness rest mostly on lexical evidence. Whitman 
(1985) has 352 Japanese–Korean proposed cognates with rigorously worked out 
regular correspondences and close lexical semantic resemblances; even given the 
copious lexical resources available for both languages, this is probably in the right 
order of magnitude to be diagnostic.⁴ Robbeets (2005) finds 359 two-consonant 
lexical resemblances and 14 shorter grammatical formatives displaying regular 
correspondences and with moderate semantic latitude between Japanese and 
one or more of Korean, Turkic, Mongolian, and Tungusic, when (given the 
copious lexical resources, semantic latitude, and choice of languages) somewhat 
over 1,000 would be needed. Both of these works are comprehensive and rigor- 
ous, so that the lexical matches they present are reliable, but only Whitman’s 
may be numerous enough to exceed chance, and that only if it is certain that none 
are loans.⁵

Oswalt (1998), using a shift test of the Swadesh 100 word list, finds nonsignificant
resemblance for core Altaic but significant resemblance on another
100 words less resistant to loss and more prone to borrowing. Given that for
all of Altaic as well as for just Japanese–Korean the structural resemblances are
stronger than the lexical ones, and the structural resemblances are part of the wider
Eurasian areal resemblances discussed below, it is more parsimonious to regard
the resemblances as reflecting language contact in late prehistoric Manchuria and
nearby. Janhunen (1996b) finds evidence for a possible deeper Mongolian–Tungusic
genealogical link overlain by much borrowing. 
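
The general logic of a shift-style test can be sketched as follows. This is a rough illustration only: Oswalt's own matching criteria and significance calculation are not reproduced here. The matching function `same_initial`, the function name `shift_test`, and the made-up word forms in the usage note are all assumptions. The idea is to count matches between two lists aligned gloss by gloss, and then to compare that count with the counts obtained when one list is shifted so that only unrelated meanings are paired.

```python
from typing import Callable, List, Sequence, Tuple


def shift_test(list_a: Sequence[str], list_b: Sequence[str],
               is_match: Callable[[str, str], bool]) -> Tuple[int, List[int], float]:
    """Compare aligned matches with matches under every cyclic shift.

    Returns the observed match count, the list of background counts (one
    per shift), and the proportion of shifts whose count is at least as
    high as the observed count.
    """
    if len(list_a) != len(list_b):
        raise ValueError("word lists must be aligned gloss by gloss")
    n = len(list_a)
    if n < 2:
        raise ValueError("at least two glosses are needed")
    observed = sum(1 for a, b in zip(list_a, list_b) if is_match(a, b))
    background = [
        sum(1 for i in range(n) if is_match(list_a[i], list_b[(i + shift) % n]))
        for shift in range(1, n)
    ]
    at_least = sum(1 for count in background if count >= observed)
    return observed, background, at_least / len(background)


def same_initial(a: str, b: str) -> bool:
    # Toy matching criterion (an assumption): identical first segment.
    return a[:1] == b[:1]
```

With the invented lists `["pa", "ti", "ku", "mo"]` and `["pe", "ta", "ko", "na"]`, three of the four aligned pairs match on the initial segment and none of the shifted pairings do. Borrowed words produce aligned matches just as inherited ones do, so a test of this kind cannot by itself separate descent from contact.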

Indo-Uralic is a promising prospect for an older family. Indo-European and Uralic 
share enough items on a closed word list to exceed chance on the calculations 
of Ringe (1998) and Oswalt (1991), and Uralic and conservative Indo-European 
languages share the rare structural feature of case-number coexponence (Bickel 
& Nichols 2005). 

Fortescue (1998) presents structural evidence for an area straddling Beringia and 
extending well into North America and Siberia. Within it there is good evidence 
for relatedness of Chukchi-Kamchatkan to Eskimo-Aleut and a wider connection, 
not necessarily genealogical, to Uralic. He calls this kind of possibly family-based 
area a mesh. But there are grave typological obstacles (Janhunen 2001). 

Vajda (2010) presents evidence for relatedness of Na-Dene (North America) and 
Yeniseic (central Siberia) which meets the individual-identifying threshold both 
lexically and grammatically (Nichols 2010; Kari & Potter 2010). The structural type 
of Dene-Yeniseic is so very different from both Siberian and North American profiles 
that this long-distance link is more likely to result from a migration of one or the other 
branch than to reflect ancient areality. 

Interior Asia has been a center of language spread at least since the Neolithic. 
The linguistic evidence points to strong and long-term areality in the epicenter 
of spread, with innovations made in the center eventually showing up farther away. 
To judge from its distribution, the m-T pronoun type may have spread early and 
then developed its strong structural parallelism in later innovations in the center; 
case-number coexponence is found at the far peripheries of the area (besides Uralic 
and Indo-European it also occurs in Chukchi and West Greenlandic), but for at 
least the last few millennia the classic agglutinating type (with monoexponential 
and transparently segmentable suffixes) has predominated in the epicenter. 
Phonemic front rounded vowels may have spread from the epicenter more recently. 
The consistently head-final morphosyntax of Uralic, core Altaic, Japanese, etc. is 
more generally widespread in Eurasia and not specific to this northeastern area. 


3 New Guinea 


The Trans-New Guinea (TNG) macrofamily was initially proposed on the basis 
of grammatical and lexical resemblances found across much of New Guinea, 
as one of the early efforts to make genealogical and typological sense of this 
genealogically most diverse of the world’s areas. For the history of the grouping 
see Ross (2005), Pawley (2005). Ross (2005), along with earlier and forthcoming work
referenced there, surveyed pronominal systems in all available languages of New
Guinea (a total of 605) and all available or evident protosystems and reconstructed 
a six-member person-number paradigm plus some other forms as ancestral to TNG 
(about half of his languages). The pronominals include, among other forms, first 
person singular *na and second person singular *ŋga. Ross also searched worldwide
for pronoun systems with first person n and second person ŋ, and found
recurrent cases only in Afroasiatic and Algic (North America). The paradigm itself 
may meet the individual-identifying threshold (depending on how firm the family 
reconstructions are), and the geographical distribution is diagnostic, with far more 
pronominal systems of this type clustered in New Guinea than would be expected 
by chance. The Afroasiatic and Algic cases show that such systems can be stable 
and possibly diagnostic over long periods of time. Ross finds the TNG system to 
be a sound preliminary diagnostic which, if not singlehandedly probative, identifies 
TNG as a likely family warranting further work and efforts at reconstruction. The 
recurrent TNG system stands out against the great variety of other pronominal 
systems among non-TNG languages. Pawley adds some further grammatical 
properties identifiable with the TNG languages, including clause chaining with 
a distinction of medial from final verbs, and a very few possible cognates. 

The TNG languages are spread across the central cordillera of New Guinea includ- 
ing the densely populated highland valleys, extending to nearby lowlands in some 
areas, and also on eastern Timor and nearby Alor. The consensus from Ross (2005) 
and Pawley (2005) — and other contributors to Pawley et al. (2005) — is that ancestral 
TNG languages are likely to have spread across the highland valleys when those 
first warmed to the point of habitability in the time frame of the first agriculture 
there (about 10,000 years ago: Denham et al. 2003; Denham 2005). Since they seem 
to have maintained dense populations from early on, TNG speakers spread into 
nearby lowlands, absorbing or displacing the languages of the still hunter-gatherer 
lowland peoples and generally raising the population of New Guinea to the point 
that Austronesian languages have made minimal inroads there (in contrast to nearby 
islands). The early time frame of the TNG expansion is consistent with other datable 
prehistoric events of New Guinea and with the general lack of evident cognate 
or even resemblant vocabulary. TNG then appears to be an Afroasiatic-like family, 
considerably older than known stocks and in view of its probable great age unlikely 
to be reconstructable. 


4 Australia 


Australia is an areally clear but genealogically uncertain case. The continent is 
short-coasted, mostly dry, and small for a continent, hence a natural spread zone, 
and it seems to have received no new immigration since the end of glaciation, 
i.e. to have been entirely isolated for about 8,000 years and nearly so for 16,000 
years. This means that extinction as a consequence of spreading must have played 
a large role in its linguistic prehistory, and the recoverable genealogical facts point 
in that direction. There is strong continent-wide areality in phonology. Over four 
fifths of Australia is covered by the large Pama-Nyungan stock, and a small (and 
better-watered) stretch in the far north contains the over 20 non-Pama-Nyungan 
stocks plus Pama-Nyungan outliers and one or two stocks that are likely first 
sisters to Pama-Nyungan. In the most recent spread episode, Pama-Nyungan spread 
perhaps some 5,000–6,000 years ago over a continent already inhabited for over 
40,000 years, in part bringing technological and cultural advances and in part recol- 
onizing desert areas abandoned during a prolonged drought, and spreading by 
a combination of language shift and demographic expansion. (For the linguistic 
prehistory see McConvell 1996; Evans & Jones 1997; Evans & McConvell 1998.) 
Progress has been made toward reconstruction in some of the Pama-Nyungan 
branches and the whole family (see chapters in Bowern & Koch 2004). 

Relative to its size the small northern area is comparable in language family 
diversity to New Guinea, which it resembles to some extent in climate and ecology. 
The northern languages may eventually be reducible to perhaps 10 stocks, still a 
diverse group. However, they exhibit many overall similarities of verb structure 
and verb prefix forms. There is also near-universal multilingualism and much intense 
local areality (e.g. Heath 1981), and several cases where languages have been shown 
to be related on the evidence of shared morphological patterns despite extremely 
few lexical cognates (e.g. Green 2003). Most Australianists assume the northern 
families are all deeply related to each other and to Pama-Nyungan. Harvey 
(2003) reconstructs a full four-person, two-number pronominal prefix paradigm, 
easily enough to establish shared descent if the reconstruction is supported 
within stocks and if paradigmatic pressures and diffusion of patterns can be firmly 
ruled out. These pronouns bear suggestive resemblances to the Pama-Nyungan 
pronouns posited by Blake (1988); see now also Alpher (2004). It is notable that 
there seem to be exactly two pronominal paradigms manifested across Australia, 
the Pama-Nyungan type and the northern type (Blake 1990: 441). For the northern 
languages see Evans (2003). 

Australian languages show normal descent relations overlaid with strong 
contact phenomena, and comparative work is beginning to sort these out. But what 
has mostly stamped the linguistic history of Australia is extinction, and this has 
been little studied. The combination of large-scale extinction in the south with 
the Pama-Nyungan spread, gradual Pama-Nyungan northward encroachment, the 
probably high incidence of language shift that accompanies long-term multilingualism,
much local diffusion, and no linguistic immigration might well have
reduced the continent’s linguistic population to descendants of just one ancestor. 
There is no way to know how many linguistically unrelated immigrations to the 
ancient Australia-New Guinea landmass may have occurred or what the typo- 
logical and genealogical diversity of Australia may have been when the postglacial 
sea-level rise began some 16,000 years ago, but it is safe to assume that Proto- 
Australian, should it prove to be a reality, was not the original and only linguistic 
immigrant to Australia over 40,000 years ago but is simply a sole survivor which 
has managed to preserve some secondary diversification at the thickest and most 
hospitable edge of a spread zone. 


5 North America 


Hokan and Penutian are two well-known macrofamilies which were assembled by 
accretion beginning in the early twentieth century and whose more or less canonical 
characterization was by Sapir (1929), for whom the construction and derivation 
of the stem is possibly the most important macrofamily-identifying feature. Stem 
structure has received much less attention in the subsequent literature, which has 
mostly sought lexical resemblances and recurrent correspondences among them. 


5.1 Penutian 


Penutian, of which I consider primarily the California and Plateau Penutian 
(CPP) core, comprises five stocks of mostly interior California and the Columbia
Plateau (central and eastern Oregon and eastern Washington): Yokutsan (southern
and central California), Utian (Miwok-Costanoan, central California), Wintun
(north central California), Maiduan (northeastern California), and Plateau Penutian 
(Klamath-Modoc, Sahaptin-Nez Perce, and possibly Molala). More distantly 
related may be the Oregon Penutian languages (Takelman, Coosan, Siuslaw, Alsea, 
Chinookan) and possibly Tsimshian (British Columbia). CPP has a distinctive gram- 
matical profile in North America: the languages are primarily dependent-marking 
with suffixal case, suffixing overall, with fairly simple verb morphology (little or 
no incorporation, valence-related rather than lexical affixation); simplex stems are 
long, often disyllabic or triconsonantal, and many kinds of ablaut and other internal 
stem change accompany inflection and derivation (Sapir 1929/1949; Silverstein 
1979; Berman 1983; Callaghan 1997: 49; DeLancey & Golla 1997; Golla 1997: 167; 
summary and critique of the evidence in Campbell 1997: 309-22). The ablaut 
patterns are potentially very strong evidence of relatedness, and CPP can be 
considered an Afroasiatic-like family if specific ablaut patterns can be shown 
to recur across the family. Callaghan (1997; 2001) finds about 150 promising 
cognate sets showing recurrent correspondences (the right order of magnitude, 
given the formal and semantic closeness and large field lexica), three ablaut 
patterns, a reduplication pattern, and other stem resemblances between Yokutsan 
and Utian, so these two stocks at least are very likely to be sisters. Apart from 
this, lexical and grammatical evidence between stocks is so far not probative 
(Campbell 1997; Shipley 1980; Callaghan 2001).⁶ Good comparativists have worked
with adequate lexical and grammatical resources, and if CPP-wide correspondences 
have not surfaced yet it may be that they never will. 

Archeological and linguistic evidence point rather firmly to a shared origin of 
all the CPP stocks in the formerly lacustrine and marshy environment east of the 
Cascades and Sierra Nevada in southern Oregon and northwestern Nevada, and 
separate entries of Utian, Yokutsan, Wintun, and Maiduan to interior California
impelled in part by the gradual desiccation of the homeland (Whistler 1977; Shipley
& Smith 1979; Moratto 1984; Aikens 1994; DeLancey & Golla 1997: 180). Utian may 
have entered California 5,000 years ago and seems to have been in the central 
part of the California Central Valley by about 4,500 years ago; Maidu may have
entered as recently as 1,000 years ago. Even in wetter times the Columbia Plateau 
and nearby lacustrine areas are likely to have been spread zones, so extinction 
should have occurred fairly regularly, which is consistent with the shallow inter- 
nal time depths of the Wintun, Maiduan, and Yokutsan groups relative to the 
age of the entire CPP. The Yokutsan family is quite shallow, probably reflecting 
a recolonization of the southern Central Valley after a prolonged drought that 
ended around 1200. 

Could the resemblances among the CPP languages be due to areality and con- 
tact in the lacustrine zone? Dependent marking and suffixation can diffuse, but 
stem composition types, stem shapes, and stem-internal alternation patterns are 
quite unlikely to diffuse and are excellent family markers (precisely these were 
the critical elements in Callaghan’s demonstration of Miwok-Costanoan unity, 
Callaghan 1997: 56). Long-term contact could of course have favored their reten- 
tion. Further evidence against an exclusively contact-based account is the fact that 
Washo, a language isolate of the Lake Tahoe region, is also a former lacustrine- 
zone language that retreated uphill during the latest drought (Aikens 1994), yet 
in most of its structure it is utterly unlike CPP. The best analysis seems to be that 
CPP languages are the dispersed and decimated remnants of a once-continuous 
language family which for a very long time occupied most of the lacustrine and 
marshy environments of interior Oregon and California and northwestern Nevada 
and at contact still occupied those that were still productive. The Yokuts-Utian
connection is barely in the range of demonstrability; the others are more distant, 
so that CPP is at best Afroasiatic-like. Still, descent explains its structure and dis- 
tribution better than contact or chance. 


5.2 Hokan 


Hokan is a less compact assortment of (in current views) 11 stocks (most of 
them isolates or small families) from all around California: from north to 
south, Chimariko, Shastan family, Karok, Palaihnihan family, Yana, Washo, 
Pomoan family, Esselen, Salinan, Yuman-Cochimi family, Seri; possibly also the 
Tequistlatecan small family of Mexico. Unlike Penutian, Hokan has no distinctive 
structural profile offering potential family markers, and consequently no clinch- 
ing paradigmatic evidence of relatedness between any of its putative branches. 
For overviews see Jacobsen (1979), Poser (1995), Jacobsen and Langdon (1996), 
Campbell (1997: 290-6), Mithun (1999: 303-4). Scholarship has focused mostly 
on lexical evidence and has found far too few likely matches to be probative, 
and few recurrent correspondences. Examples are Jacobsen (1958; Washo and 
Karok only, 121 matches, some monoconsonantal; in the right order of magni- 
tude for his word list and levels of agreement) and Kaufman (1988; a compre- 
hensive list of putative protoforms but unfortunately without data included, 
so the parameters cannot be determined). Some of its own best advocates 
and family specialists regard Hokan as unproven and possibly unprovable but 
a fruitful hypothesis that spurs interfamily comparison and areal research 
(Jacobsen & Langdon 1996). On the other hand, the evidence is such as to have
enabled researchers to agree that the Chumashan and Yuki-Wappo families are 
not Hokan. 

Structurally, Hokan languages are unremarkable in North America: for the most 
part head-marking but not polysynthetic, often stative-active, usually with a mix
of prefixation and suffixation and without highly complex verbal morphology. 
Hokan has a residual geography in California, occupying mountains and coasts 
wherever Uto-Aztecan and CPP languages have not spread, and (together with the 
not even macro-classified Chumashan and Yuki-Wappo families) it undoubtedly 
represents the earliest linguistic stratum of California (Whistler 1977; Jacobsen 1979: 
547; Moratto 1984: 550-2; Conathan 2004). This would put its time frame well 
beyond the reach of the comparative method if it is a family. Many of the lan- 
guages have been in contact with other Hokan languages, and it is hard to dis- 
tinguish loans and contact phenomena from possible inheritances (Mithun 1999: 
304). All in all this set of languages seems to reflect neither shared descent nor 
notable areality, but simply very long residence in situ and as neighbors. 


6 South America 


6.1 Quechumaran 


Long-range classifications and several others have placed the Aymaran and 
Quechuan families of the Andes highlands into a single Quechumaran macrofamily. 
For a review of the debate see Campbell (1997: 273-83). The evidence presented 
both pro and con is partly typological and partly lexical. Typological evidence 
includes pervasive and deep-seated resemblances in morphological type, position 
classes in the verb, structure of the reconstructed Proto-Aymaran and Proto- 
Quechuan sound systems, phonotactics and syllable structure, all of which 
Campbell correctly identifies as diffusable and not diagnostic of descent, and none 
of which by itself appears to be particularly unusual in South America (the extremely 
complex verb morphology shared by the two is unusual, however). Orr and 
Longacre (1968) propose 255 Quechuan-Aymara putative cognate sets with regu- 
lar correspondences, but these prove to be mostly sharings due to borrowing: 
“Given the virtually identical form of the shared items, the radically different char- 
acter of the remainder of the lexicon is left unexplained” (Adelaar & Muysken 
2004: 35), and the method involved not first demonstrating likely relatedness but 
seeking the most resemblant sets and then seeking recurrent correspondences in 
those (Hardman de Bautista 1985; Campbell 1997: 280-1). McMahon et al. (2005; 
reported in McMahon & McMahon 2005: 165) do a controlled lexicostatistical 
comparison of resemblant sets among 30 genealogically very stable and 30 
genealogically less stable glosses in Quechuan and Aymaran; on the first set the
two families prove quite discrete while on the second set they are very close with 
much reticulation, supporting the contention that the two families are related by 
borrowing and not descent. 


372 Johanna Nichols 


Campbell (1995) compares pronominals and some inflectional forms in Proto- 
Quechuan and Proto-Aymaran and decides that, though most of the lexical 
and structural resemblances are best ascribed to contact, there are fundamental 
grammatical resemblances that suggest deep genealogical relatedness as well. His 
examples and discussion show that his comparanda are mostly monoconsonantal, 
he defines consonants very broadly (in some cases suggesting sound changes that 
might unify phonologically very different forms), and he uses selective parsing 
to segment resemblant consonants from longer forms. His verbal affixes seem to 
be selected from the presumably very large set of verbal affixes in both languages, 
and are too few to be diagnostic. The personal pronoun forms, however, are poten- 
tially diagnostic: viewed as a four-member closed set of just the independent pro- 
nouns, they are not probative but would be if the selective parsing were removed 
and the identity of the consonants made more precise. Precisely these goals 
would be reached if internal and comparative reconstruction within the two 
families could make the consonants’ identities more secure and firmly motivate 
the segmentation. Campbell is right to conclude that further historical work is 
warranted. 

Ancestral Aymaran was the lingua franca of the Tiwanaku empire in the 
Andes and Quechuan was that of the succeeding Inca empire and further spread 
by its use as lingua franca in the early colonial period. Before the imperial 
spreads the two protolanguages were probably near neighbors in contact, and they 
have been in a relationship of extensive language shift and bilingualism for
at least the last 500 years and probably 1,500. The relatedness that Campbell may 
have detected antedates this, perhaps considerably. If the relatedness stands up 
to scrutiny, the high Andes are a case of deep sisters in intense areal convergence, 
rather like the Balkan Sprachbund. A further similarity is that both Quechuan and 
Aymaran have discontinuous distributions over large parts of the area, so that 
local varieties are in contact with local varieties everywhere (or were before the 
colonial era). 


7 Conclusions 


Historical linguistics not only is able to identify active language areas and 
language families up to the level of the stock, but also has the tools and infor- 
mation to identify genealogical and areal groupings going a step or two beyond 
these: Afroasiatic-like families (detectable but not reconstructable) like Afroasiatic 
itself, probably Indo-Uralic, and Utian (actually likely to be a stock), still older 
groupings that are good candidates for Afroasiatic-like families (Altaic, with or 
without Japanese and Korean; CPP, TNG, Quechumaran), remnants of former areas 
(the ancient greater Rift Valley, inner northeast Eurasia, possibly Hokan), and 
more dispersed and probably older large areas (the Caucasus-Himalayas enclave
set, the Pacific Rim: see Bickel & Nichols 2003; in press). Long-intergrown com- 
binations of areality and shared descent can be factored out, as shown above for 
Macro-Sudan, Chad-Ethiopia and Khoisan, Altaic, CPP, and Quechumaran. 
Still puzzling are the large geographical groupings identified by resemblant
pronouns: inner Eurasia (m-T pronouns), TNG (n-7), Australia (ng and possibly 
ny), and the Pacific Rim especially in the Americas (n-m; Nichols & Peterson 2005). 
The TNG forms are part of a six-member system which identifies a likely family, 
but the others are two-member paradigms, not extensive enough to be diagnos- 
tic. The Eurasian m-T pattern becomes less rather than more resemblant as one 
reconstructs within families (Comrie 1998; Vovin 1998). The American n-m one is 
found in all three of CPP, Hokan, and Quechumaran, among others, so if it is 
inherited it comes from much earlier times than even these. All of the patterns 
involve counterposed high-frequency consonants which, in small closed paradigms 
like pronouns, are functionally optimal (Rhodes 1997) and share a sound-symbolic 
basis with mama-papa vocabulary (Nichols 2001), factors which, in principle, 
should favor both inheritance and diffusion. For all of the systems, however, the 
geographical distributions are nonaccidental, so they could well be the last, best 
identifiable surviving inherited trait in very old families. 

If grammatical evidence can identify Afroasiatic-like families, lexical evidence 
cannot. Ultimately, when relative stabilities of various lexemes are better under- 
stood, procedures like those used by Oswalt (1998) and McMahon et al. (2005), 
comparing high-stability to low-stability vocabulary, should make this possible. 
A putative family with significantly more resemblances among high-stability 
than low-stability lexical items could be considered a firm family whether or not 
reconstructable. 
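
The comparison of high-stability with low-stability vocabulary proposed above can be sketched as a simple significance test. The following is only an illustration under stated assumptions: the text does not name a statistical procedure, so a one-sided Fisher exact test is swapped in here, and the function name `fisher_one_sided` and all counts are hypothetical rather than drawn from Oswalt (1998), McMahon et al. (2005), or any other study.

```python
from math import comb


def fisher_one_sided(high_hits: int, high_misses: int,
                     low_hits: int, low_misses: int) -> float:
    """One-sided Fisher exact test on a 2x2 table.

    Returns the probability that at least `high_hits` of all resemblant
    glosses fall into the high-stability list purely by chance.
    """
    n_high = high_hits + high_misses          # size of the high-stability list
    hits = high_hits + low_hits               # resemblant glosses overall
    total = n_high + low_hits + low_misses    # all glosses compared
    upper = min(n_high, hits)
    favourable = sum(
        comb(hits, x) * comb(total - hits, n_high - x)
        for x in range(high_hits, upper + 1)
    )
    return favourable / comb(total, n_high)


# Hypothetical counts for two 30-gloss lists (invented for illustration only).
p = fisher_one_sided(high_hits=12, high_misses=18, low_hits=3, low_misses=27)
print(f"one-sided p = {p:.4f}")
```

A small p-value here would correspond to "significantly more resemblances among high-stability than low-stability lexical items". Swapping the two lists tests the opposite skew, the pattern that the text associates with borrowing.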

To answer these and other questions and take many language families and areas 
back to Afroasiatic-like and Inner Eurasian-like time depths we need much more 
internal and comparative reconstruction within families and more work on the 
relative propensity to diffusion, inheritance, loss, and spontaneous innovation of 
various structural features. We need to know whether, as seems likely, inherited 
features, when given areal support, can outlive the life expectancy they would 
otherwise have. A more fundamental need is of course more and better docu- 
mentation of the world’s languages. 


NOTES 


1 Ehret 1995 and Orel and Stolbova 1995 both offer thoroughly worked out Proto- 
Afroasiatic reconstructions with sound correspondences and cognates. The two recon- 
structions are incompatible in the sense that correspondences differ and therefore what 
is recognized as cognate to what differs. Since they are incompatible, at least one of 
them must be wrong. This proves that regular correspondences do not necessarily demon- 
strate genealogical relatedness. I believe this point was first made by Ratcliffe 2003. 

2 According to Bomhard (1999: 48-9) the two reconstructions are incompatible. As with
Afroasiatic, this proves that at least one of them is wrong and therefore correspondences 
and (putative) cognates are no guarantee of actual common descent. 

3 I have retracted the PIE homeland proposed there but the rest of the analysis stands,
including Eurasian steppe sociolinguistics and spread patterns. 
4 If the consonant matches are taken to be identical consonants (in view of the regular
correspondences), then 352 is sufficient.


5 Based on a quick survey of a random 10% of both sources, up to 15% of Whitman’s 
cognates and up to 25% of Robbeets’s may be one-consonant roots. 

6 Liedtke 2007 presents 87 sets with close formal and semantic resemblance as evidence
of Wintuan-Sahaptian relatedness (and 64 for Wintuan-Klamath). Even given the rich
lexical sources available on these families, these figures meet the threshold because of
their formal and semantic stringency.
19 Contact and Prehistory:
The Indo-European 
Northwest 


THEO VENNEMANN 


The Indo-European languages, spread in ancient times over large parts of Eurasia 
and in recent times over large parts of the world, have emanated, according to the 
views of nearly all specialists, in prehistoric times from a small territory some- 
where near the seam between Asia and Europe. Since all other parts of the double 
continent suitable for human settlement were already populated by non-Indo- 
European peoples by the time of the Indo-European expansion, all or nearly all 
historical Indo-European languages must have been in contact with non-Indo- 
European languages in prehistoric times, and this is certainly so for Europe north 
of the Central Divide, the mountain ranges separating central, west, and northern 
Europe from southern Europe: the Pyrenees and the Alps. 

While this much is clear, it is less clear what the prehistoric non-Indo-European 
contact languages of Indo-European might have been. It is, indeed, least clear for 
the Indo-European languages north of the Central Divide. For while we possess 
suggestions, most of them vague and controversial, in the writings of ancient authors 
concerning possibly non-Indo-European contact languages, and in part even 
texts or individual names and expressions from those languages, these materials 
relate almost exclusively to the situation in southern Europe: Pelasgian in Greece, 
Etruscan in Italy, Iberian in Spain, to mention but three of those languages. For 
Europe north of the Central Divide there exists hardly anything, and for a huge 
territory comprising most of the German, French, English, and North Germanic 
speaking countries, there is close to nothing at all. 

How does one identify possible earlier substrates of languages in territories where 
no such indigenous languages have survived? One looks for historical analogs 
and derives a rule of thumb. One analog is the question of the substrate of English 
in England. Here we know that the substrate was the language which survives in 
the mountainous regions of the west, Brittonic or Cymric, surviving as Welsh 
in regions of Wales. Another analog is the question of the substrate of English in 
Ireland. Here we know that the substrate was the Irish language which survives 
natively in a few remote parts of Ireland, the Gaeltacht. Yet another analog is
the pre-Celtic Pictish language of the Celtic Isles: It survived longest in northern 
Scotland and possibly in certain parts of Ireland. Then there is the language
spoken in Bavaria before her Germanization in the sixth century. It was the 
Raeto-Romance language now only spoken in remote parts of the Alps. Clearly 
the rule of thumb for the identification of earlier substrates is this: Substrates of 
intruding languages survive longest on economically uninteresting or hard-to-access 
fringes. As the examples show, these fringe areas usually lie on the opposite side 
of where the invasion began. 

So let us look for a non-Indo-European language on the economically uninter- 
esting or hard-to-access fringes of Europe north of the Central Divide, especially 
in the far west because the Indo-Europeans intruded into Europe from the east. 
The area that first comes to mind is the Alps. But the pre-Indo-European language 
there, Raetian, was most likely related to Etruscan and was thus itself the lan- 
guage of intruders, viz. from the Aegean. That leaves us with the Pyrenees. And 
there we find a non-Indo-European language that is not assumed by anyone with 
convincing arguments to be the language of recent intruders but is generally 
assumed to be indigenous to that part of the world: Basque. Our rule of thumb 
tells us that Basque is a survivor of the substrate of the Indo-European languages 
north of the Central Divide. 

This is as it should be: During the last glaciation Europeans north of the Central 
Divide could only survive in southern France (and, farther away, in the Balkans). 
When the ice receded, Europe north of the Central Divide was repopulated by 
those survivors. The language of southern France was Basque; the southwest was 
Basque even in Roman times. The Vasconicity of Europe north of the Central Divide 
was demonstrated with linguistic means in the early 1990s. It was corroborated 
by genetic research soon afterwards. The two maps, one linguistic, one genetic, 
are almost identical. 

In the present paper, only linguistic influences will be discussed, to be more 
precise: a selection of structural influences. For the numerous lexical influences, 
especially in the toponymy of Europe, reference is made to Freche (1995), Appelt 
(1998), Réder (2000), Bohm (2003), Welscher (2005), Vennemann (2006a; 2006b; 2008; 
and most of the chapters in Vennemann 2003). 


1 Vigesimality 


A prototypically structural property of languages is the way the larger numerals 
are constructed in them.¹ Since the number of natural numbers is infinite and even
the number of large numbers used in languages of advanced civilizations is too 
large to name them individually by means of simplex words, principles by which 
to construct designations of large numbers starting from the individually named 
small numbers are needed. What we therefore find in such languages is an iso- 
latable portion of the grammar serving this purpose: a primitive vocabulary of 
its own designating the smallest numbers and a set of rules for the sole purpose 
of constructing the larger numbers needed by the speakers of the language, in 
principle any natural number of arbitrary magnitude (“ad infinitum”). Some of
these rules may be of the word formation kind, others purely syntactic, depending
to some extent on the typological character of the language. Ideally such a
system has simplex numbers from 1 to some number n (the “basic unit of counting”),
then builds multiples of n filling up the gaps by adding the simplexes one
by one, until n times n is reached. Then the addition continues. The multiples n
times n, n times n times n etc. may be given names of their own. But the system
really only becomes fully workable if the number zero is added and a written
notation is developed. Mixed systems are common.²

Figure 19.1 Northern and eastern Vasconic expansion
(a) Linguistic map, northern part, Vasconic expansion. Cf. Vennemann (2003: xv) (originally prepared for Vennemann 1996)
(b) Genetic map, northern part, Haplogroup V expansion. “Map of Europe depicting the most likely homeland of haplogroup V and its pattern of diffusion” from Torroni et al. (1998: fig. 4)
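
The construction principle described above, with a basic unit of counting n, can be illustrated by decomposing a number into multiples of powers of n plus a remainder. The sketch below is only an illustration: the function names `decompose` and `describe` are hypothetical, and real numeral systems are, as noted, often mixed rather than this regular.

```python
def decompose(number: int, base: int) -> list[tuple[int, int]]:
    """Return (coefficient, power) pairs with number == sum(c * base**p)."""
    if number < 0 or base < 2:
        raise ValueError("expects number >= 0 and base >= 2")
    parts = []
    power = 0
    while number > 0:
        number, coefficient = divmod(number, base)
        if coefficient:                      # skip empty positions
            parts.append((coefficient, power))
        power += 1
    return parts[::-1]                       # highest power first


def describe(number: int, base: int) -> str:
    """Spell out the decomposition, e.g. 87 in base 20 as '4 x 20 + 7'."""
    terms = []
    for coefficient, power in decompose(number, base):
        if power == 0:
            terms.append(str(coefficient))
        elif power == 1:
            terms.append(f"{coefficient} x {base}")
        else:
            terms.append(f"{coefficient} x {base}^{power}")
    return " + ".join(terms) or "0"


print(describe(87, 10))   # 8 x 10 + 7
print(describe(87, 20))   # 4 x 20 + 7
print(describe(340, 20))  # 17 x 20
```

With 10 as the basic unit, 87 is built as eight tens and seven; with 20 as the basic unit the same number is four twenties and seven.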

A curious feature of the western Indo-European languages is the occurrence of 
vigesimality, i.e. counting with 20 as a basic unit,³ either alongside or, partly,
instead of decimality, counting with 10 as a basic unit.⁴ I say curious because the
basic unit of counting in English is ten (-teen, -ty), and this has been so since Proto-
Indo-European times.⁵ We count from 1 to 10, then add 1 (eleven) and again 1
(twelve) and again 1 (thirteen) until we reach twice 10 (twenty), then further until 
we reach three times 10 (thirty), then forty, fifty and so on up to 10 times 10 (one 
hundred); etc. English is decimal, and this is an inherited Indo-European feature. 
Yet we also find in English a modicum of vigesimality. There existed, in Great
Britain until 1971, the peculiarity that there were 20 shillings to the pound. There 
even exists a word for ‘20 of its kind’, score, borrowed from Scandinavian. Many 
speakers know it as the second word of President Abraham Lincoln’s Gettysburg 
address of 19 November 1863, which begins with the sentence, “Four score and 
seven years ago our fathers brought forth on this continent a new nation, con- 
ceived in Liberty, and dedicated to the proposition that all men are created equal.” 
The reference is to the year of the Declaration of Independence of the United States 
of America from the British Empire on 4 July 1776. Thus Lincoln was looking 
back 87 years, and this is what four score and seven years means. Christians may 
remember Psalm 90, verse 10: “The days of our years are threescore years and
ten; and if by reason of strength they be fourscore years, yet is their strength labor 
and sorrow.” 

In French one counts decimally up to 60: dix, vingt, trente, quarante, cinquante, 
soixante. Then, strangely, 70 is not septante⁶ but soixante-dix, i.e. 60 [and] 10, and,
even more strangely, 80 is quatre-vingt, 4 [times] 20, and 90, quatre-vingt-dix, 4 [times] 
20 [and] 10. 

Nonetheless these are but individual vigesimal concepts and numerals. We 
have to look beyond English and Modern French to see veritable vigesimal 
number systems in Northwestern Europe.⁷


1.1 Vigesimality in Germanic (Danish) 


The only Germanic language with a vigesimal counting system is Danish. The 
following cardinal and ordinal numbers are taken from Bredsdorff (1970: 74-7): 


(1)  10   ti                               10th    tiende
     20   tyve                             20th    tyvende
     30   tredive                          30th    tredivte
     40   fyrre, fyrretyve                 40th    fyrretyvende
     50   halvtreds, halvtredsindstyve     50th    halvtredsindstyvende
     60   tres, tresindstyve               60th    tresindstyvende
     70   halvfjerds, halvfjerdsindstyve   70th    halvfjerdsindstyvende
     80   firs, firsindstyve               80th    firsindstyvende
     90   halvfems, halvfemsindstyve       90th    halvfemsindstyvende
     100  hundrede                         100th   hundrede


100 hundrede: Old Danish also: femsynnetyffwe, femsindetiuge, 
femsynnomtivffwe, i.e. ‘five times 20’ 


The way these numbers are constructed is not immediately clear. Therefore I cite 
Bredsdorff’s explanations: 


The names of the Danish numerals from 20 to 90 are a queer mixture of the 10-system
and the old 20-system... 
20 tyve is by origin the plural of ti. The meaning of the singular is still seen in tredive,
i.e. ‘three tens’, and fyrre, i.e. ‘four tens’, occasionally pronounced “fyrretyve”
[foerraty:va]. Here the ending has in time been confused with the word “tyve.” 

From 50 the real 20-system begins. The numeral “halvtreds” is an abbreviated form 
of “halvtredsindstyve,” which literally means “half third (i.e. 2½) times (‘sinde’ . . .)
twenty.” Similarly “tres(indstyve)” has the original meaning of ‘three times twenty’;
“halvfjerds(indstyve)” means ‘half fourth (i.e. 3½) times twenty’; “firs(indstyve)” means
‘four times twenty’; and “halvfems(indstyve)” means ‘half fifth (i.e. 4½) times
twenty’. 

The short forms of the cardinals given in the list above (“fyrre,” “halvtreds,” etc.) 
are generally used, but it is possible (e.g. for special emphasis) to use the long forms 
(“fyrretyve,” “halvtredsindstyve,” etc.). The ordinals are all formed on the basis of 
the long forms (“fyrretyvende,” “halvtredsindstyvende,” etc.). (Bredsdorff 1970: 77) 
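
The arithmetic behind Bredsdorff's explanation can be checked mechanically. The sketch below takes the multipliers from the quotation ("half third" = 2½, and so on) and multiplies each by twenty; the names `halv`, `MULTIPLIERS`, and `value` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def halv(ordinal: int) -> Fraction:
    """'Half n-th' means half a unit short of n: 'half third' is 2 1/2."""
    return Fraction(ordinal) - Fraction(1, 2)


# Multiplier of twenty for each short form, as glossed in the quotation.
MULTIPLIERS = {
    "halvtreds": halv(3),      # half third  = 2 1/2
    "tres": Fraction(3),       # three
    "halvfjerds": halv(4),     # half fourth = 3 1/2
    "firs": Fraction(4),       # four
    "halvfems": halv(5),       # half fifth  = 4 1/2
}


def value(short_form: str) -> int:
    """Numeric value of a Danish decade word: multiplier times twenty."""
    product = MULTIPLIERS[short_form] * 20
    if product.denominator != 1:
        raise ValueError(f"{short_form} does not yield a whole number")
    return int(product)


for word in MULTIPLIERS:
    print(word, value(word))
```

The five forms come out as 50, 60, 70, 80, and 90, matching the cardinal values listed in (1).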


Ross and Berns write about this way of counting: 


The formation of Mod. Danish tresindstyve ‘three times twenty’ and the like is fairly 
straightforward; it is the raison d’étre that is difficult. The vigesimal forms develop 
in Old Danish⁸ and are dominant in Modern Danish. In the relevant formations the
second element is, naturally, always ‘twenty’. (Ross & Berns 1992: 612) 


The formulation “The vigesimal system developed in Old Danish” is not quite 
accurate because the system exists in Danish from the very beginning of its documentation.
As a matter of fact, “In Old Danish, vigesimal counting proceeds beyond
the decades to the early hundreds” (Eliasson 2006: 102). The largest vigesimal 
numeral cited by Eliasson (2006: 103) is 340, setthensindstyve (i.e. ‘17 times 20’). 


1.2 Vigesimality in Celtic 


Insular Celtic vigesimality only developed in historical times. Old Irish⁹ (at least
written Old Irish) still shows the inherited Indo-European decimality (Greene
1992: 511): 


(2)  10 deich   30 tricho      50 coico   70 sechtmogo   90 nócho
     20 fiche   40 cethorcho   60 sesco   80 ochtmogo    100 cét


However, the vigesimal way of counting may already have been available in the 
spoken language, because it enters the written language in Middle Irish: 


Fiche ‘twenty’ continues its Old Irish form and declension. During this period the 
vigesimal system begins to be normal, although all the decads up to and including 
‘ninety’ are still attested. There are signs that the precision of the Old Irish system 
was breaking down. (Greene 1992: 525) 


Classical Modern Irish shows the triumph of vigesimality: 


Fiche ‘twenty’ plays an increasingly important part in the system . . . All these shifts 
arise from a tension between the literary standard, which tried to preserve the decads, 
and the spoken language, which had undoubtedly gone over to the vigesimal system
by this time; no modern Irish dialect has preserved any of the decads above
‘twenty’, nor is there any trace of them in Scottish Gaelic or Manx. (Greene 1992: 530)


Greene here draws the typical picture of the rise of substratal features from the 
language of the lower classes to the language of the ruling class and thus to the 
written language. 


1.3 Vigesimality in Romance 


In Modern Standard French only 70, 80, and 90 are named vigesimally. By contrast, 
Old French seems to have been thoroughly vigesimal. Price (1992) writes: 


Vigesimal forms not now found in Standard French occur at earlier periods. Nyrop 
(1960: §490) quotes the following attested Old French forms: 

[30] vint e dis
[40] deus vins
[60] trois vins
[70] trois vins e dis
[80] quatre vins
[90] quatre vins e dis
[120] sis vins
[140] set vins
[160] huit vins
[180] neuf vins
[220] onze vins
[240] douze vins
[280] quatorze vins
[300] quinze vins
[320] seize vins
[340] dis set vins
[360] dis huit vins

Of these, six-vingts is well attested in the seventeenth century... The form quinze- 
vingts survives in the name of the hospice de Quinze-Vingts, founded by St Louis in 
1260 as an asylum for three hundred blind people. There is also a rue des Quinze- 
Vingts in Troyes. (Price 1992: 463-4) 
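
The Old French forms quoted above all follow one pattern: a multiplier times twenty, optionally followed by e dis ‘and ten’. The sketch below checks this against the values given in square brackets. It is only an illustration: the function name `old_french_value` and the table `MULTIPLIERS` are hypothetical, and the spellings are simply those of the quotation.

```python
# Multipliers as spelled in the quoted list.
MULTIPLIERS = {
    "deus": 2, "trois": 3, "quatre": 4, "sis": 6, "set": 7, "huit": 8,
    "neuf": 9, "onze": 11, "douze": 12, "quatorze": 14, "quinze": 15,
    "seize": 16, "dis set": 17, "dis huit": 18,
}


def old_french_value(form: str) -> int:
    """Value of a form such as 'quatre vins e dis' (4 x 20 + 10 = 90)."""
    score_part, _, addend = form.partition(" e ")
    if score_part == "vint":                 # a single twenty
        total = 20
    else:
        multiplier, word = score_part.rsplit(" ", 1)
        if word != "vins":
            raise ValueError(f"unexpected form: {form!r}")
        total = MULTIPLIERS[multiplier] * 20
    if addend:
        if addend != "dis":
            raise ValueError(f"unexpected addend: {addend!r}")
        total += 10
    return total


print(old_french_value("quinze vins"))        # 300
print(old_french_value("quatre vins e dis"))  # 90
```

The first result agrees with the three hundred blind people for whom the hospice de Quinze-Vingts is said to have been founded.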


Romance vigesimality is not restricted to French. I simply quote Price’s succinct 
account: 


In Franco-Provençal, Maps 1239 and 1240 of the ALF [Atlas linguistique de la France]
give forms corresponding to trois-vingts ([tre vé] etc.) trois-vingt-dix for some points
in Savoie, with the comment that this usage is “vieilli.” With reference to the patois
of Bagnes, one of the best preserved Franco-Provençal dialects in Switzerland, we
are told (Bjerrome 1957: 68) that some vigesimal forms were maintained until 
recently, “pour indiquer le nombre de vaches d’un alpage,” e.g. wi vé vatse (= huit 
vingts vaches), sa vé vatse e demyi (= sept vingts vaches et demi, i.e. ‘one hundred and 
fifty cows’)... 
Occitan generally retains a decimal system. However Palay (1961) gives trés-bints
(= trois-vingts) as well as chichante for ‘sixty’ and, under cén(t), comments: “On
emploie souvent, au lieu de cén, le comp. cinq bints.” Various other Gallo-Romance
forms (including deux-vingts in Haute-Marne, and a parallel form diivé in Savoie 
(ALF Map 1110, Point 965) are quoted in von Wartburg (1922-, 14: 443-444)... 

Outside the Gallo-Romance area elements of a vigesimal system are well established
in Southern Italy (see Rohlfs 1966-1969, §975-976), particularly in Sicily (du
vintini ‘forty’, du vintini e ddeéci ‘fifty’, etc., up to cincu vintini ‘hundred’), but also in 
parts of the southern mainland, e.g. dua/tri/quattro vintini in various Calabrian 
dialects and parallel forms as far north as the Abruzzi. In some cases the system is 
used for numerals up to ‘fifteen times twenty’ = ‘three hundred’, e.g. quinnici vintini 
(Cosenza) and diecentine ‘two hundred’ and quindice intine ‘three hundred’ at 
Vernole. These forms are of course based on collectives (corresponding to Fr[ench] 
vingtaine), but forms corresponding exactly to the French type are found in Salentine 
dialects (quattro vinti). Widespread though the vigesimal system is in southern Italy, 
the decimal system coexists with it, and the use of the vigesimal system is restricted 
to specific functions, e.g. for stating a person’s age or for counting eggs, fruit, etc. 
Sporadic forms occur in Ibero-Romance. (Price 1992: 464-6) 


The Latin way of counting was purely decimal. Therefore the existence of viges- 
imality in wide parts of the Romance world is in want of explanation. What is 
also in want of explanation is that vigesimality is a feature only of western Romance; 
for Greene (1992: 463) writes, “Rumanian stands alone in having entirely aban- 
doned the Latin forms for the decads. However, although the Rumanian forms 
do not reflect their Latin equivalents, the decimal system itself remains intact.” 
The passages I have highlighted by boldface in the above quotations from Price 
(1992) show a close connection of counting by 20 to elemental aspects of life. Non- 
inherited linguistic features of this sort are typical substratal residues. 


1.4 On the origin of western Indo-European vigesimality 


There is no generally accepted account of the origin of vigesimality in the western 
Indo-European languages. As may be expected, there are two sorts of theories, 
one ascribing it to spontaneous indigenous innovation, the other, to contact with 
other languages. 

Since I find it hard to believe that speakers with a decimal system spontaneously 
switch to a vigesimal one, in my view the only acceptable sort of explanation is 
one based on language contact. Price (1992: 469) offers Celtic, Norman, and Basque 
as possible giving languages. Of these, Celtic is most often mentioned, probably 
because most authors are impressed by the well-known partial vigesimality of 
French and think of this language as having developed on a Celtic (Gaulish) 
substratum. Tagliavini (1998), for instance, in his chapter on the Celtic substrate, 
mentions vigesimality in his main text only with regard to French:


Keltisch ist vielleicht das System der Vigesimalzählung, von dem das Französische
in quatre-vingt(s) einen Rest bewahrt hat (das Altfranzösische kannte noch treis-vinz
‘sechzig’, sis-vinz ‘hundertzwanzig’). Gleichwohl gibt es Beispiele dafür auch in anderen
Sprachen, und andererseits existieren Spuren des Typs huitante, nonante in französischen
Dialekten. (Tagliavini 1998: 101)


[Celtic is probably the vigesimal counting system which is the source of the French 
remnant which can be recognised in quatre-vingt(s) (in Old French there was treis- 
vinz ‘sixty’ and sis-vinz ‘a hundred and twenty’ as well). Furthermore, there are 
examples in other languages and there are also traces of the huitante, nonante type 
in French dialects. (Tagliavini 1998: 101).] 


In my view Celtic should not even be mentioned as a possible source for western 
Indo-European vigesimality because (a) Celtic is Indo-European and therefore 
originally decimal; (b) historical Insular Celtic started out decimal, becoming viges- 
imal only during the Middle Ages; and therefore, (c) Gaulish, lying chronologic- 
ally between Proto-Celtic and Insular Celtic, must be assumed to have been decimal, 
a conclusion supported by the fact that “the only known relevant Gaulish form, 
tricontis ‘thirty’, fits clearly into the decimal system” (Price 1992: 466). 

Holding Norman, i.e. the Vikings and, in the final analysis, Danish responsible 
for western Indo-European vigesimality is equally unexplanatory because Proto-Germanic,
as a branch of Indo-European, was decimal, and so was Proto-Norse.
Therefore, even if it is admitted that the Normans had some role in the spread 
or consolidation of vigesimality, e.g. in southern Italy and Sicily, Danish vigesi- 
mality is not the explanans but is itself an explanandum. 

That leaves us with Basque. And indeed, since vigesimality in decimal Indo- 
European cannot be explained with an Indo-European, i.e., by default, decimal 
substratum, non-Indo-European Basque should be considered first rather than last 
as a possible substratal source of western Indo-European vigesimality. Basque
is vigesimal, and even though Basque texts of any length are not older than the 
sixteenth century CE there is no indication that it ever was anything but vigesimal. 


(3) Standard Basque cardinal numbers (Trask 2003: 127; King 1994: 414)


     1  bat               11  hamaika                10  hamar
     2  bi, biga ~ bi     12  hamabi                 20  hogei, hogoi
     3  hiru, hirur       13  hamahiru, -r           30  hogeitahamar
     4  lau, laur         14  hamalau, -r            40  berrogei
     5  bost, bortz       15  hamabost, -bortz       50  berrogeitahamar
     6  sei               16  hamasei                60  hirurogei
     7  zazpi             17  hamazazpi              70  hirurogeitahamar
     8  zortzi            18  hamazortzi             80  laurogei
     9  bederatzi         19  hemeretzi              90  laurogeitahamar
    10  hamar             20  hogei                 100  ehun


    21 hogeitabat; 22 hogeitabi; 105 ehun eta bost; 1000 mila


    1979  mila   bederatziehun   hirurogeita   hemeretzi
          1000   900             60-and        19


As one can see, this particular historical system shows some adjustment to the 
Latin and modern Romance decimal way of counting. In particular the special 
role of the numeral 100 is non-vigesimal in spirit. As the Old French and Old Danish
systems show, genuine vigesimal counting goes beyond 100, 100 itself simply being 
5 times 20. Regional vigesimal variants of numerals beyond and including 100 
suggest that in Basque too this limit is a modern innovation and that in older 
Basque vigesimal counting continued without limit. Eliasson (2006: 103) cites the 
following examples from Lafitte (2001: 77) and DGV, 4.363, 5.541: 


(4)  100  bortzetan hogoi      5 × 20
     120  seietan hogoi        6 × 20
     160  zortzetan hogoi      8 × 20
     180  bederatzitan hogoi   9 × 20
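

The arithmetic behind (3) and (4) can be made explicit in a small sketch. It assumes only the forms listed in (3) and treats the written-together pattern of hogeitabat as general, ignoring the word spacing seen in hirurogeita hemeretzi. The helper names decompose and basque_below_100 are hypothetical labels chosen for illustration; they are not taken from any of the sources cited, and Python is used merely as a convenient notation.

```python
# Illustrative sketch only: spellings follow table (3); function names are hypothetical.

# Simple forms for 1-19, as listed in the first two columns of (3).
UNITS = {
    1: "bat", 2: "bi", 3: "hiru", 4: "lau", 5: "bost",
    6: "sei", 7: "zazpi", 8: "zortzi", 9: "bederatzi", 10: "hamar",
    11: "hamaika", 12: "hamabi", 13: "hamahiru", 14: "hamalau",
    15: "hamabost", 16: "hamasei", 17: "hamazazpi", 18: "hamazortzi",
    19: "hemeretzi",
}

# Names of one to four twenties, as listed in the third column of (3).
SCORES = {1: "hogei", 2: "berrogei", 3: "hirurogei", 4: "laurogei"}


def decompose(n):
    """Split n into (number of twenties, remainder below twenty)."""
    return divmod(n, 20)


def basque_below_100(n):
    """Compose the numeral for 1-99 on the vigesimal pattern of table (3)."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1-99")
    scores, rest = decompose(n)
    if scores == 0:
        return UNITS[rest]           # 1-19: simple form
    if rest == 0:
        return SCORES[scores]        # 20, 40, 60, 80: bare multiple of twenty
    # Otherwise: multiple of twenty + "ta" ('and') + remainder of 1-19.
    return SCORES[scores] + "ta" + UNITS[rest]


if __name__ == "__main__":
    for n in (21, 30, 50, 80, 90):
        print(n, decompose(n), basque_below_100(n))
```

On this decomposition 90 is four twenties plus ten, and the regional forms in (4) simply continue the same division beyond five twenties: decompose(160) gives eight twenties with no remainder, matching zortzetan hogoi.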


Assuming a Basque substratum to be responsible for western Indo-European viges- 
imality was problematic in the past because the Basque territory was viewed as 
too small to account for the wide spread of vigesimality in Europe. Naturally, 
Basque vigesimality could be considered an import from Romance. But that this 
was untenable was recognized even in the absence of an alternative explanation: 


Entwistle (1936: 18) suggests that the Basque vigesimal system “may be of Celtic 
provenance,” but there is no evidence to support this and, if Gaulish in fact had no 
vigesimal system and the vigesimal system of Welsh, Irish, etc. is a comparatively 
recent innovation, then of course Entwistle’s suggestion lacks any foundation at all. 
(Price 1992: 490, n. 30) 


However, the role of Basque has changed completely in the Vasconic theory of 
prehistoric Europe: Since almost all of western, central, and northern Europe is 
assumed to have been Basque in this theory, Basque structural patterns may be 
expected to be found everywhere in this area — that is, the source of these 
features is Basque.

It must be stressed that vigesimality is merely a set of structural patterns and its 
importation into another language is not tied to the names of the basic numerals. 
Uncontrolled foreign language learning, as in language shifting, is initially the 
learning of foreign words and putting them in the patterns of the native language: 


(5) Latin substance and Vasconic form in French vigesimal counting


                           ‘4’          ‘20’
    Substance: Latin       quattuor     viginti
                              ↓            ↓
    Form: Basque ‘80’      laur-        -ogei
                              ↓            ↓
    Result: French ‘80’    quatre   -   vingt


Only continued intensive learning may lead to greater approximation even to the 
structural targets of the new language. 


2 Two Copulas


Proto-Germanic, as indeed Proto-Indo-European, had only a single copula. This 
may not strike anyone as peculiar, because the same is true for Contemporary 
English as well as the English of Shakespeare and of Chaucer. It is also true for 
historical North and East Germanic, as can be seen in the Old Norse and Gothic 
present indicative paradigms translating the forms of English be: 


(6)  Old Icelandic   Gothic

     em              im       ‘(I) am’
     es(t)           is       ‘(thou) art’
     es              ist      ‘(he/she/it) is’
     erom            sijum    ‘(we) are’
     eroð            sijuþ    ‘(you) are’
     ero             sind     ‘(they) are’


But this is not true for Old English. 


2.1 Two copulas in Old English and Celtic 


All Old English dialects had, from the time of their earliest attestation, two 
copulas, each with a complete present indicative paradigm: 


(7)  Old English, West Saxon

     s-paradigm   b-paradigm

     eom          béo      ‘(I) am’
     eart         bist     ‘(thou) art’
     is           biþ      ‘(he/she/it) is’
     sind(on)     beoþ     ‘(we/you/they) are’


Of these, the s- or eom-paradigm is recognizably a formal continuation of the 
Germanic paradigm also reflected in Old Norse and Gothic, while the b- or
béo-paradigm is an innovation. As to the meaning of these two copulas in Old 
English, Campbell writes: 


béo expresses what is (a) an invariable fact, e.g. ne bið swylc cwenlic þeaw [Beowulf
1940] ‘such is not a queenly custom’, or (b) the future, e.g. ne bið þe wilna gad [Beowulf
660] ‘you will have no lack of pleasures’, or (c) iterative extension into the future,
e.g. bið storma gehwylc aswefed [Phoenix 185-6] ‘every storm is always allayed’ (i.e.
on all occasions of the flight of the Phoenix, past and to come); eom expresses a present
state provided its continuance is not especially regarded, e.g. wlitig is se wong
[Phoenix 7] ‘the plain is beautiful’. (Campbell 1959: 350)


This un-Germanic twofold paradigm for the copula was explained as a contact 
phenomenon as early as 1925, when Keller pointed to the formal similarity of the 
béo-paradigm with the b-paradigm of the Celtic languages, for which see Pedersen
(1976: II. §§636–41), especially the tables in sections 637 and 645, and Lewis and
Pedersen (1989: §§476–86), especially the tables in sections 477 and 485. Whereas
the Celtic s-paradigm rarely shows its s anymore, owing to intense phonological 
change, the b-paradigm is clearly recognizable as such in all Insular Celtic 
languages. As to the origin and meaning of the Celtic paradigms, Lewis and 
Pedersen (1989) state: 


The paradigm of the verb ‘to be’ consists in Italo-Celtic of forms of the roots *es- and 
*bheu-. In Celtic a pres. stem *bhwi-, *bhwije-, derived from *bheu-, also appears. This 
latter present denotes either a praesens consuetudinale or a future, a natural develop- 
ment from an orig[inal] meaning ‘to become’ (Lat. fio). The same root is also used 
in the subjunctive. The root *es- stands only in the pres. and ipf. ind. in Celtic; in 
Ir[ish] it is not found in the ipf. (Lewis & Pedersen 1989: §476.1) 


The paradigms added to the inherited s-paradigm in Old English and in Middle 
Welsh are remarkably similar both as to form and to meaning: All forms in the 
paradigm of both languages begin with a b- followed by a front vowel; and the 
meanings formulated by the specialists – “(a) an invariable fact . . . or (b) the future
. . . or (c) iterative extension into the future” in Old English and “a praesens
consuetudinale or a future” for Celtic – are close enough to invite the idea that the
innovations did not arise independently. Keller (1925: 59) too emphasized both 
facts and said specifically that the functional agreement was especially remark- 
able because it implied “a greater similarity of thinking between Anglo-Saxons 
and Britons than between Anglo-Saxons and Frisians or Germans.” Indeed,
whereas both Old English on one hand and Frisian and Old Low and High German
on the other differ from North and East Germanic in showing b-forms for the
expression of ‘to be’, only Old English has developed them into a separate 
second present tense paradigm, while the other West Germanic languages com- 
bine b-forms with s-forms in a single paradigm, without any evidence for a split 
as in Old English. 


(8)  Frisian   Old Saxon   Old High German

     bim       bium        bim
               bist        bist
     is        is          ist
     sind      sind(um)    birum
                           birut
                           sind


Schumacher (2007) sees the b-forms in Frisian and German as evidence for a 
separate, contact-induced prehistoric West Germanic b-paradigm which was 
conflated with the inherited s-paradigm before the earliest Frisian and German 
documents but preserved in Anglo-Saxon through the continued contact with Celtic. 
This view is considered unlikely by Lutz (2009): Contact-induced grammatical
categories do not result from borrowing but from language shifting.
Schumacher’s view presupposes that the West Germanic peoples are for the 
most part Celts who learned Germanic. There is no independent evidence for this 
to be true — except in the case of the English. Nevertheless the Frisian and 
German b-forms are important: They show that not only Insular Celtic but also 
Continental Celtic had the b-forms, and by implication that the Celtic separate 
b-paradigm originated on the Continent and was taken from there to the Isles.

At this point an important question arises: If we follow Keller in explaining the
“greater similarity of thinking between Anglo-Saxons and Britons than between
Anglo-Saxons and Frisians or Germans” by the assumption that large numbers of
Britons shifted to Anglo-Saxon in the centuries after the Conquest, carrying their
way of expressing themselves, their “thinking,” into the target language, then how
do we explain that the Celts themselves differ from the rest
of the Indo-Europeans by this special way of thinking? How did they acquire this 
un-Indo-European manner of speaking? 

The simplest answer would be one beginning with the words, “In the same way.” 
It would also be the best answer, because it would not require additional or 
different theoretical assumptions. The double paradigm would then simply 
exemplify the model which I named “the transitivity of language contact” 
(Vennemann 2002). It suggests that for Celtic we look into the substratum of Celtic 
in Central Europe, just as for Old English Keller looked into the substratum of 
Anglo-Saxon in Britannia. So let us look into Basque and see if that language too 
has two paradigms for ‘to be’, two copulas. 


2.2 Two copulas in Basque


Trask (1997: 113) emphasizes the difference between two meanings, eventive 
and stative, in connection with two verbs ‘to be’: “Note the difference between 
moxkortu naiz ‘I got drunk’ (a little while ago) (eventive, with izan) and moxkor- 
tuta nago ‘I’m drunk’ (stative, with egon).” For these two verbs corresponding to 
English ‘to be’, izan and egon, one eventive, the other stative, de Azkue (1984, s.vv.) 
provides, besides French étre for both, the following Spanish equivalents, each as 
the first (1°) of several translations, where “(c)” indicates that the item is “común
... á toda la lengua,” rather than being restricted to one or several of the dialects
or individual communities: 


(9) IZAN...1° (c) ‘ser’ 
EGON 1° (c) ‘estar’ 


While both verbs and the semantic distinction thus are común, several authors mention
that the distinction is more common in the west or in the south of the Basque
dialect area, as in the following instances. 


Copula. All varieties have a copula, used both in equational sentences and in predications
of class membership. Southern varieties distinguish two copulas, the second
being roughly confined to expressing temporary states or conditions and locations 
(temporary or permanent). (Trask 1997: 119) 


Izan is the verb to be: Irakaslea naiz I am a teacher. In the third person it can also 
express there is/are: Bada eliza bat There is a church. Izan is also used as an auxiliary. 
It is intransitive ... Egon is another verb meaning to be, and also stay or wait: Isilik 
dago He is silent; Zaude hemen! Wait here! It is characteristic of western dialects and 
is intransitive. (King 1994: 362) 


Agud and Tovar (1993, s.vv.) even seem to restrict the occurrence of the second 
of these two words to the westernmost dialect: 


(10) IZAN ‘ser’, ‘estar’, ‘haber’, ‘tener’, ‘soler’ 
EGON, IGON V[izcaino] ‘estar, quedar’ 


But the phenomenon is given much space in grammars of Basque. Let me cite 
just two recent accounts. Sagüés (1994: 45-6) writes about izan:


El verbo izan desempeña en euskara un doble papel:
– por un lado corresponde al verbo «ser» castellano.
    P.e. ni zaharra naiz : yo soy viejo
         zu gaztea zara : tú/ud. eres/es joven
– y por otro lado actúa como verbo auxiliar.
    P.e. ni etorri naiz : yo he venido
         gu etorri gara : nosotros hemos venido


Etxepare (2003) begins the section about “Copular constructions” with the following 
description and examples: 


Basque makes a distinction between stage-level predications (those which attribute 
some transitory property to the subject of predication) and individual-level predica- 
tions (those which attribute some standing property to the subject of predication) in 
the auxiliary selected to express them. Transient properties are assigned by the verb 
egon ‘be in a location’, whereas standing properties are assigned through the verb 
izan ‘be’. The distinction, which is for the most part limited to western dialects, is 
reminiscent of the one found in Spanish between ser and estar... Izan is also used 
in equative sentences. (Etxepare 2003: 365) 


Since it is not known how old this way of “thinking” in terms of two copular 
verbs is in Basque, whose oldest extensive documentation did not begin before 
the sixteenth century, there is no way of proving that the Vasconic substrate of 
Celtic on the Continent did have this feature. There is, however, indirect evidence 
for the assumption that it goes back to prehistoric times. 

Where we find late prehistoric or early historical — let us say, recent — devel- 
opments of a second copula, as in Celtic and in Northwestern Romance, its 
origin in a verb with the meaning ‘to become’ (*bheu-, Celtic) or ‘to stand’ (Latin 
stare, Romance) is transparent. Similarly in Basque, where an original meaning
‘to stay’ or ‘to wait’ can be recognized for egon. But beyond that, Agud and Tovar
(1991, s.v.) write about egon: “Es otra de las palabras vascas importantes cuya
etimología no está en modo alguno resuelta.”

Another indirect argument follows from the way the development of a second 
copula — from Latin stare ‘to stand’ alongside the continuation of inherited esse 
‘to be’ — is distributed in the Romance languages. 


2.3 Two copulas in Northwestern Romance


The two copular paradigms based on Latin esse and stare exist in Portuguese,
Galician, Castilian, and Catalan as well as, to a lesser extent, in Italian and, to an
even lesser extent, in Sicilian, but not at all in Rumanian. They once existed in French:


Modern French has only one copula. Old French, however, had two, estre (ESSE >
essere > *essre > estre) and ester (STARE > *estare > estar > ester), and distinguished between
them in a similar way to other Romance languages. With phonetic evolution, the
forms of each verb tended to be confused with one another, with the result that estre
finally absorbed ester; around the same time, most words beginning with est-
changed to ét- or êt-. The modern form of the verb is être.

The only clear trace of ester (or éter if we bear in mind the loss of the s) in the
modern copula is the past participle: instead of the *étu one would expect, we
find été.


The areas south of the Pyrenees, as well as north of the Pyrenees and the Alps, 
plus to a lesser extent northern Italy, are precisely the regions assumed to have 
been Vasconic before the Indo-Europeanization of Europe. The null hypothesis
accounting for this distribution of the rise of a second copular paradigm — in north- 
western Romance and in Continental Celtic — is, on the basis of the theory of a 
once-Vasconic Western and Central Europe, that this particular way of syntactic 
“thinking” in terms of two copulas was carried from the Vasconic substrate into 
those Indo-European superstrates in the process of language shifting.

An argument against the above contact interpretation could be the fact that 
in Contemporary Basque the use of the two copulas is primarily a feature of the 
western dialects, so that superstratal transfer from Spanish to Basque should be 
assumed (cf. Trask 1997: 292-3). The correct interpretation of this distribution in 
the dialects is, on the contrary, that the extended superstratal contact with 
French, which reverted to a one-copula syntax centuries ago, led to diminished 
use of the second copula egon in the eastern dialects of Basque. 

Needless to say there have been attempts to explain the rise of a second copula 
in Romance, especially in Spanish, in terms of the internal dynamics of change.
For example, Posner (1996: 313) suspects three language-internal “factors” as 
having contributed to this development. She does not seem to notice, however, 
that the explanantia may themselves be in need of explanation. Thus, that 
“another factor that may have played a part is early loss in the Iberian languages 
of full lexical meaning of estar” is actually part of the explanandum rather than
an explanans. Posner does not consider the possibility of external influence. Pfaller
(2003) uses the model of “expressive change” of Koch and Oesterreicher (1996) 
and Detges (2001) to develop the view that the expressive use of the position verbs 
sedēre, stare, and iacēre gave rise to a semantic change which, as a consequence of
increasingly “normal” use, ran from the positional Grundbedeutung ‘sit’, ‘stand’, and
‘lie’ via merely locational semantics toward ‘be’. She does not ask the question
why the same kind of change does not affect the corresponding position verbs in
German or Russian in the same or similar ways, nor does she consider the possibility
of external influence. 


2.4 Two copulas in Irish English 


Finally, one may ask if “thinking in terms of two copulas” is at all transferable 
from one language to another, namely from a substratum to its superstratum. That 
is, one would like to see a bona fide case from a recent or present-day contact 
situation. Fortunately, there is such a case. 

The two-copula system of Old English was simplified in Early Middle English.
What has remained to the present day is a mixed paradigm (am, are etc., to be, 
been) combining forms from the old s- and b-paradigms in a new paradigm with 
a uniform copular meaning. This is the situation in the standard language of 
England. Not so in Irish English. There we see again the rise of a second 
copula with an habitual meaning, do be (+ -ing-form of the main verb), cf. Hickey 
(2007: 141): 


(11)  Irish:          Bíonn sé ag caint léi.
                      [is-HABITUAL he at talking with-her]
      Irish English:  He does be talking to her.
                      ‘He talks to her repeatedly.’


See also Hickey (2007: 173) and the list of examples on pp. 216-17, which 
includes sentences without -ing-forms: 


(12) I do be up in a heap ... I don’t be able to get out at all.
All the dances we have now are céili dances. They do be very good. 


The formal side of this situation is not completely uniform. There are alternative 
ways of expressing the habitual-copular meaning, with different regional distri- 
butions or preferences. The chief alternative is invariable be (+ -ing-form of the 
main verb), e.g. I be there every day (Hickey 2007: 183). Another is inflected be, e.g. 
The week-day be’s a quiet day (Hickey 2007: 231). Yet another is unstressed do (used 
also in the past, did) + infinitive of the main verb (Hickey 2007: 216n): 


(13) He does 'fish. 
That’s what we did ‘say to each other. 
I did ‘never hear of that at all. 


This multiple formal representation of the habitual is not damaging to the idea
of the transfer of “thinking in terms of two copulas” from substratal Irish to super- 
stratal English in Ireland. Hickey explains this situation as a natural consequence 
of language shifting: When using English the Irish, with two copulas in their native 
language, search for a way of expressing the meaning of the second copula, the 
b-copula, in English, which does not offer an equivalent, and hit upon different 
solutions, which become different ways of regularly expressing habituality in devel- 
oping Irish English, even for those speakers who are monolingual in Irish 
English. Hickey formulates this succinctly in terms of category and exponence: 


With the habitual, it is particularly important to distinguish between category and 
exponence. The existence of the category habitual in Irish would have triggered a 
search for categorial equivalence among the Irish speakers during language shift. 
The exponence of the habitual in Irish does not have anything like a formal match 
in English, neither in its standard form and nor in the habitual structures which have 
arisen in Irish English historically. (Hickey 2007: 214n.) 


Hickey discusses the use and the history of the expression of habituality in Irish 
English at great length (2007: 213-37). He leaves no room for doubt that it is a 
genuine innovation owed to the Irish substrate rather than, as has also been claimed, 
a continuation of the Old English double paradigm through Scots (2007: 227-31). 


3 Accent


Proto-Indo-European had a variable accent. In three and only three of the Indo- 
European language subfamilies was this way of accentuation given up in favor 
of a fixed initial accent, namely a first-syllable accent: in Germanic, Italic, and Celtic: 


Germanic: “Den überkommenen idg. Akzent hat das Germanische grundlegend
verändert: es hat die Möglichkeit des ‘freien’ Akzentes völlig aufgegeben und
ihn festgelegt auf die jeweils erste Silbe eines Wortes (Anfangsbetonung oder
Initialakzent) ... Aus dem starken Verfall der Endungen ergibt sich gleichzeitig,
daß die Natur des germ. Wortakzentes eine vorwiegend dynamische gewesen
sein muß” (Krahe & Meid 1969: vol. 1, §27).
[Germanic: “The traditional Indo-European accent underwent a major change
in Germanic. The ‘free accent’ was abandoned and accent was fixed on the
first syllable of a word (initial accent) ... As a result of the considerable reduction
of inflections the nature of the Germanic word accent must have also been
mainly dynamic” (Krahe & Meid 1969: vol. 1, §27).]

Italic (Latin): “There is little disagreement that the prehistoric accent of Latin was 
a stress accent, and that this fell on the first syllable of the word. Its effects are 
seen in the loss or weakening of vowels in the unaccented syllables, which is 
typical of strong stress in some other languages” (Allen 1970: 83). 


Celtic (Irish): Initial accent cannot be doubted in the case of Old Irish: “Words [in 
Old Irish] susceptible of full stress take this on the first syllable... The stress 
is expiratory and very intense, as may be seen from the reduction of unstressed 
syllables” (Thurneysen 1946: §36). 


The Germanic first-syllable accent is evident in all early Germanic languages, 
and in their metrical systems based on alliteration; it must therefore be assumed 
to date back to Proto-Germanic. The situation is less straightforward for Italic and 
Celtic. “Latin is the only Italic language for which we have any information” (Sihler 
1995: §246). For Continental Celtic there exists no direct evidence either. Gaulish 
place names show final accent (falling on one of the last three syllables) in their 
later phonological development. Brittonic has a final accent, which fell on the ultima, 
the penult, or the antepenult in the various sub-branches at different stages of 
development (Pedersen 1976: vol. 1, §180). But Schrijver (1995: 19-20) presents 
evidence that the predecessor of British had initial stress. “This strongly suggests 
that the system of initial stress that we find in Irish goes back at least to PInsCl
[Proto-Insular Celtic] times.”

Celtologists are divided between the view that Proto-Celtic accentuation is 
unknown and the view that a Proto-Celtic first-syllable accent may be inferred. 
I side with the latter because the attested diversity can be best explained as 
deviations either caused by a later Romanization (the case of Gaulish toponyms) 
or by developments toward the universally preferred final accentuation, with 
final accent — ultimate or penult — as the commonest pattern (cf. Hyman 1977). 
Furthermore, Mercado (2007) has made a case that similarities between Italic and 
Old Irish meters are owed to common inheritance, which suggests that the 
accentuation in the two branches was the same at the time when those meters 
originated: “Cisalpine Celtic and Old Irish . . . point to a Proto-Celtic trochaic-dactylic 
colon most likely cognate with that of Proto-Italic,” this colon forming the basis 
of “the prehistoric accentual meters of initial-stressing Italic and Celtic” 
(Mercado 2007: Abstract).

Clearly the assumption that not only Proto-Germanic but also Proto-Italic and 
Proto-Celtic had first-syllable accent yields the simplest overall account of the 
historical phonology and metrics of Germanic, Italic, and Celtic. These three were 
neighboring languages in prehistoric times, as impressively demonstrated by Krahe,
spoken most likely in a coherent central European area between southern 
Scandinavia and the Alps. The shift to initial accent is therefore best accounted 
for as a sprachbund phenomenon. Since there is no language-internal explanation
for the development of initial accent in this sprachbund, language contact with
non-Indo-European languages has been held responsible for nearly a hundred
years (since Feist 1913: 375). Salmons (1992) shows that prosodic properties are
transferred with special ease from substrata to their superstrata. I have given
arguments in Vennemann (1994: §7.6) for the assumption that the substrate of this 
Indo-European initial-accent sprachbund, the Old European language reflected 
in the Old European toponymy, had initial word accent, and in the same and 
several other articles, that this language was Vasconic, which is also the position
taken in the present chapter. 

It would be welcome support for this explanation of the west Indo-European 
initial accentuation if it could be proved that the surviving member of the 
prehistoric Vasconic language family, Basque, had initial word accent at a 
sufficiently early stage of development. Since there is no attestation of very 
early Basque, the evidence can only be indirect. Following Martinet (1964) and 
Michelena (1977), I adopted this view in Vennemann (1994). However, Hualde 
has argued in a number of articles, most recently in Hualde (2003 and 2007), 
that all accentual systems observable in Basque (for which see Hualde 1999; 2003), 
including the initial accent systems reconstructed by Martinet and Michelena,
have developed from an earlier stage at which the language lacked word accent 
and only possessed phrase-level prosody (Hualde 2007), a kind of system known 
from Modern French, though with different contours. This result does not, of course, 
preclude the existence of a word-accentual system with initial accent at a still 
earlier stage. The fact that consonant clusters occur almost exclusively between 
the first and second syllable points to first-syllable accent. 

In concluding this section I would like to mention that Etruscan, a non-Indo- 
European, non-Vasconic language of northern and central Italy, was subject to 
first-syllable accent too. This is shown by the reductions, especially syncopations, 
observable over several centuries: These affected especially the second syllables 
of words, but also later syllables; they never affected the first syllables of words 
(Bonfante 1990: 335). Etruscan, the language of intruders from the Aegean area, 
had spread over a Vasconic substratum exactly as the western Indo-European 
languages, and the substratum speakers likewise took their native initial accent
into the intruding language they had to learn. 


4 Conclusion 


The study of language contact is not a recent phenomenon. Yet many
linguists have shied away from contact explanations in historical linguistics,
and in some circles of Indo-European studies references to substratal influence 
amount to academic suicide. As late as 1998 I could assess the situation as 
follows. 


Gewiß hat sich die Indogermanistik auch mit den nicht-indogermanischen Zügen der 
indogermanischen Sprachen befaßt, aber doch längst nicht mit derselben Intensität 
wie mit den deutlich indogermanischen. Auffällig ist auch, daß die Befassung mit 
den nicht-indogermanischen Zügen als wenig ehrenhaft zu gelten scheint. Man 
spricht, nicht selten abfällig, von der “Substrattheorie,” und wenn ich es richtig höre, 
so meint die Verwendung des Ausdrucks “Theorie” in diesem Zusammenhang soviel 
wie “nur eine Theorie” – so als ob nicht alles in den empirischen Wissenschaften 
nur eine Theorie wäre und als ob nicht umgekehrt die stratale Beeinflussung aller 
Sprachen zu den beinharten Ergebnissen der Sprachkontaktforschung gehörte. 

Gerade bei den west-indogermanischen Sprachen, bei denen die nicht-indogermanischen 
Kontaktsprachen (im Gegensatz etwa zu den Drawida-Sprachen als 
Kontaktsprachen des Indischen) nicht ohne weiteres zu erkennen sind, kann von 
einer ernsthaften Bemühung der Indogermanistik um eine Erschließung der 
vor-indogermanischen Substrate und weiterer vor-indogermanischer Kontaktsprachen 
kaum die Rede sein. Ich meine, daß die andere Abteilung der Indogermanistik, in 
deren Skopus diese Probleme gehören, die gleiche Anstrengung der Indogermanisten 
verdient wie die ohnehin wesentlich weiter ausgebaute erste. (Vennemann 1998: 
130-1) 


[True, Indo-European studies has concerned itself with the non-Indo-European 
elements in Indo-European languages but by no means with the same intensity as 
with the clearly Indo-European elements. It is also remarkable that the investigation 
of non-Indo-European elements is not held in such high esteem. Not infrequently a 
dismissive tone is found in discussions of “substrate theory” and, if I have observed 
this correctly, the use of the word “theory” in this context means “just a theory” — 
as if everything in empirical science was not just a theory anyway. It is as if mutual 
influence among languages did not belong to the core insights of language contact 
studies. 


Especially in the case of the western Indo-European languages, where the non-Indo- 
European contact languages cannot be so easily recognized (in contrast, for instance, 
with Dravidian in contact with Indic), one can say that Indo-European scholars are 
not particularly concerned with establishing what the pre-Indo-European substrate 
and contact languages were. I believe that the part of Indo-European studies which 
is concerned with these matters deserves the same attention as does that which focuses 
on inherited features of Indo-European languages. (Vennemann 1998: 130-1)] 


It seems, however, that the increasing endeavor to develop contact linguistics as 
a principled subdiscipline of linguistics, which began more than half a century 
ago (Weinreich 1953) and has picked up enormous momentum in recent years 
(e.g. Thomason & Kaufman 1988; Goebl et al. 1997-8; Mufwene 2001; Thomason 
2001; Winford 2003),³⁷ has changed the climate in favor of a more open attitude, 
witness, for example, the conference “Languages in Prehistoric Europe” (University 
of Eichstätt, 4-6 October 1999) and the resulting volume (Bammesberger & 
Vennemann 2003) which brought Indo-Europeanists and scholars of other fields 
together as speakers, discussants, and authors and proved, if nothing else, that 
the old reluctance to confront the topic seems not to exist any longer. 

I hope to have shown in the present paper that dealing with the question whether, 
and if so to what extent, languages related to Basque may have substratally 
influenced in prehistoric times the west Indo-European languages north of the 
Great Divide, just as they did south of it, especially south of the Pyrenees (e.g. 
Baldinger 1963;³⁸ Garvens 1964), is a serious and promising enterprise which may 
take us a long way toward understanding why those languages show so many 
sprachbund-like commonalities in their grammar, lexicon, and toponymy. 
NOTES 


1 This section is for the most part a condensed English version of Vennemann (2003: 
ch. 17, §17.1.3). 

2 For a comparative introduction to number systems see the chapter “The number 
sequence” in Menninger (1992). 

3 From Lat. vigēsimus, -a, -um ‘twentieth’. 

4 From Lat. decimus, -a, -um ‘tenth’. 

5 This is evident in the descriptions of the number systems of the Indo-European 
languages in Gvozdanovic (1992), where deviations from decimality are always 
recognizable as secondary. Cf. also Beekes (1995: 212): “The numerals of PIE 
[Proto-Indo-European] can be reconstructed down to the last detail... The numeral system is based 
on the counting of decimals.” 

6 The decimal variants septante, octante/huitante, nonante occur regionally; cf. Price (1992: 
461-3 et passim). See also the Internet site “septante, octante, huitante, nonante,” 
www.langue-fr.net/index/s/septante.htm (accessed 12 May 2008). 

7 A brief survey of vigesimality in Europe may also be found in Menninger (1992: 64-9). 

8 The Old Danish forms are given in Ross and Berns 1992: 616-19. 

9 I limit my quotations to Irish. The other Insular Celtic languages show a similar 
picture, according to Greene 1992. 

10 Only in a bibliographical note is vigesimality in southern Italy mentioned. 

11 Rohlfs (1971: §§ 96-8), discussing various possibilities between monogenesis and 
polygenesis, favors the idea of a Norman origin, even though he stresses the occurrence 
of vigesimality in older varieties of Spanish as a complicating factor and mentions 
Basque “im geographischen Übergang vom Französischen zum Spanischen” (p. 132). 
Menninger (1992: 110) tabulates the vigesimal system of Basque but considers the 
Normans the most likely importers of vigesimality into French (1992: 67). 

12 Meillet (1964: 414) did reckon with a non-Indo-European source of western 
Indo-European vigesimality but remained unspecific: “Comme des traces plus nettes 
encore de système vigésimal se retrouvent dans le domaine celtique, on se demande 
si ceci n’est pas dû à une survivance d’un usage pré-indo-européen.” 

13 The eastern forms have (h)ogoi also in the following numerals. 

14 Here and in the following numerals -tahamar is shortened from eta hamar ‘and ten’. 

15 More explicit forms occur for 60, 70, 80, and 90; e.g. for 90 lauretan hogei eta hamar 
‘four-times twenty and ten’. See Eliasson (2006) for an explanation of the termination -etan. 

16 Basque features generally become more sparse the farther east one looks. For example, 
in Balto-Slavic, which shows much of the Vasconic Old European hydronymy, traces 
of vigesimality are only found in the western languages and are explained as borrowings 
from Germanic and Venetian (Comrie 1992: 722-3 with n. 3). 

17 I call this paradigm the s-paradigm because it is, with certain irregularities mentioned 
directly, the etymological continuation of the present of the Indo-European copula, 
based on the root *h₁es-: sing. *h₁és-mi, *h₁és-si > *h₁ési, *h₁és-ti, plur. *h₁s-més, *h₁s-té, 
*h₁s-enti (cf. Sihler 1995: §492). 

18 I call this paradigm the b-paradigm because it is based on the Indo-European root 
*bheu- and Proto-Indo-European *bh became *b in Germanic (and Celtic, see below). 
Whereas *h₁es- inflected as a present/imperfect, *bheu- was aoristic, probably because 
its original meaning was ‘to become’ rather than ‘to be’ (Sihler 1995: §491). 
19 With some irregularities, explanations for which are rarely attempted in the handbooks 
(but see Onions 1969: s.v. be): The eo of eom may be analogical after bēo, the ea and -t 
of eart after the present perfects (e.g. þu þearft ‘you need’); but the -r- of eart is obscure, 
possibly of very early Scandinavian influence. 

20 The DOE (s.v. bēon) too lists three different types of usage distinctions for the s- and 
the b-paradigm, (a) present vs. future, (b) statal vs. actional, and (c) non-durative vs. 
durative: “The distinction is abandoned in early Middle English, earlier in Northern 
than in Southern and Southwestern texts (cf. MED s.v. ben, OED s.v. be, Jost 1909: 139-40, 
Brunner 1962: 277-9)” (Lutz 2009: 233, n. 19). 

21 This is Lutz’s translation of Keller’s “eine größere Ähnlichkeit des Denkens bei 
Angelsachsen und Briten...als bei Angelsachsen und Friesen oder Deutschen.” 
Keller was relying on the semantic study of the Old English double paradigm by Jost 
(1909), which has recently found support in the assessment of the DOE, s.v. bēon. 

22 That individual forms may be borrowed into an existing paradigm in a situation of 
non-substratal contact is shown by Lutz (2009: 237-8) with Old Norse erom ‘(we) are’, 
which entered the Anglian s-paradigm as a new Einheitsplural aron and in the course 
of time ousted the inherited sind(on). 

23 I omit further explanations of the use of izan. 

24 Among them izan, cf. Agud and Tovar (1991, s.v.). 

25 From “Romance copula” cf. http://en.wikipedia.org/wiki/Romance_copula (accessed 
5 May 2008). 

26 In large parts of Italy there was an intermediate time of Etruscan domination. 

27 The classical historical account for Spanish is Bouzet (1953). For work within 
generative grammar cf. Batllori (2006) and earlier articles. Recent work on copulas in 
general and in Romance includes Pustet (2003), Maienborn (2005; 2007), and several 
articles in Theoretical Linguistics 31.3 (2005). 

28 Spanish yacer ‘to lie’ only showed the beginning of this semantic change in the Middle 
Ages but did not complete it. 

29 Cf. Lutz (2009: 233, n. 19), who refers to MED s.v. ben, OED s.v. be; Jost 1909: 139-40; 
Brunner 1962: 277-9. 

30 The corresponding non-habitual copula form would be fa. 

31 One may speculate that the development of fundamental metrical techniques of the 
new initial-accent languages was influenced by the meters of the initial-accent 
substrate language. 

32 E.g. in Krahe (1954: 71-2, 171); Krahe & Meid (1969: vol. 1, §§3-7). 

33 This is also the basic assumption in Salmons (1992). 

34 Salmons (1992) too assumes such contact in the case discussed above but, differing 
from the position taken here, with Uralic languages. 

35 The wisdom of everyday language acknowledges this: Speakers of second languages 
speak it “with an accent.” 

36 That Etruria was part of the prehistoric Vasconia is shown by its toponyms, e.g. the 
name of the Arno river which Krahe (1964: 46) reckoned among the Old European 
hydronyms and in which I recognize Basque aran ‘valley’ (Vennemann 1999: §3.1.2; 
2006b: 975). 

37 There is even a separate Internet site “Language contact,” 
http://en.wikipedia.org/wiki/Language_contact (accessed 16 July 2008). 

38 Baldinger shows that connections with Basque (Aquitanian) are found as far west as 
Galicia and Portugal. 
20 Contact and the History 
of Germanic Languages 


PAUL ROBERGE 


1 The Germanic Languages 


The Germanic languages are, as the phrase suggests, a group of languages that 
trace their origin to a common ancestor and constitute a branch of the Indo-European 
language family. Prior to the beginning of the present era, Germanic is presumed 
to have been “a fairly homogenous linguistic and cultural unit” (Prokosch 1939: 26). 
Proto-Germanic (in German Urgermanisch) is the hypothetical “parent” language 
existing at a given point in time: “We assume a single Germanic language, with 
a common core of speakers, on the basis of elements common to all its dialects” 
(Lehmann 2007: Preface). Lehmann (1977: 287), who does not take account of 
possible pre-Germanic linguistic encounters, asserts further that “we... have 
good evidence to conclude that Germanic was little influenced in structure by 
external contacts leading to interference until approximately the middle of the first 
millennium B.C.” After the formation of Proto-Germanic at the turn of the fifth 
century BCE, external influences come to be considerable. 

Over the course of approximately a millennium and a half, the ancestral 
language fragmented into dialects, which ultimately gave rise to the universally 
recognized independent languages of the European metropole, in all their vari- 
eties: English, German, Dutch, and Frisian comprise the West Germanic group; 
Danish, Norwegian, and Swedish, together with the insular Scandinavian 
languages of Faroese and Icelandic, make up the North Germanic group. A third 
group, East Germanic (of which Gothic is the principal representative), has 
been extinct since at least the sixteenth century. None of these languages 
has developed without contact with others, but the mechanisms of linguistic 
diversification are conventionally understood in terms of primary hybridization 
(in the sense of Whinnom 1971) following the separations of peoples from the 
core group. Accordingly, the structural distance between the daughter dialects is 
the aggregate of a series of (mostly) incremental system-internal mutations that 
have taken place during the uninterrupted intergenerational transmission of 
grammar over a period of some two millennia. Yet, none of this is as straight- 
forward as it sounds. 
2 Non-Indo-European Substratal Influences 


The presence of non-Indo-European languages in areas in which Indo-European 
dialects would eventually become superordinate has been established beyond 
doubt in many instances. Whether or not there was a linguistic substratum for 
Germanic has long been a matter of dispute (see Prokosch 1926, for a considered 
assessment of older theories; also Neumann 1971; Vennemann 1984; Mees 2003); 
but nowadays most authorities acknowledge that there must have been one. Polomé 
(1990: 337, emphasis in original) stakes out a strong position regarding what the 
Indo-European vanguard encountered in northern Europe: “There is no doubt that 
whichever way northern Europe was Indo-Europeanized, the new population 
initially constituted a mere adstratum or superstratum to a long-established set of 
peoples. When and why the language shift took place remains widely an open 
question, but one thing is certain: it did not take place without leaving clear traces 
of the prior language(s) in the lexicon.” 

The most compelling argument rests on a substantial number of terms in 
Germanic that do not admit satisfactory Indo-European etymologies. Consider, 
for example, two cross-cultural terms in Germanic that are loans from coterri- 
torial languages. As Hamp (1979) has shown, the Germanic term for ‘apple’ 
(Crimean Gothic apel, ON¹ epli, OE æppel, OHG apful) and its congeners in Celtic, 
Baltic, and Slavic point to a prehistoric *oblu-, clearly a non-Indo-European form; 
on the term for ‘apple’ see Markey (1988; 1989b: 599-600); also Vennemann (1998: 
132-4). The source for the Germanic word for ‘silver’ (Go. silubr, ON silfr, OE 
seolfor, OS silubar, OHG sil(a)bar) and cognates in Baltic and Slavic is obscure but 
may be due to substratal contact with Vasconic; cf. Basque zilar, zidar (Biscayan) 
(Polomé 1987: 229; Vennemann 1997: 881-2). Feist (³1924: 88) estimated that up 
to a third of Germanic vocabulary is of non-Indo-European origin, and this figure 
is sometimes referenced in the literature (e.g. Witczak 1996: 171-2; Salmons 
2004). Prokosch (1939: 23) thought that further etymological analysis would 
reduce that figure “to a negligible quantity,” while Vennemann (2000: 233) opines 
that such research would take it much higher, to more than half and possibly 
even three-fourths. Interestingly, Salmons (1992: 107-8, following Bird 1982: 119) 
has determined that 67.4 percent of Pokorny’s (1959) Indo-European roots are 
represented in Germanic, which is at once consistent with Feist’s estimate but turns 
out to be the highest retention rate of all the Indo-European daughter dialects. 
Salmons (2004) cautions that even when lexical items appear to satisfy the cri- 
teria for substratum status (set out by Polomé 1989: 54-5), they could still reflect 
internal neologisms. 

Some linguists have attached great importance to prehistoric contact between 
Indo-Europeans and non-Indo-European strata of population, positing contact 
of sufficient intensity to leave an imprint on the structure of a pre-Germanic 
recipient language in the course of language shift. The Germanic consonant shift 
has sometimes been considered a phonological transfer from a substratum (e.g. 
Güntert 1934: 72; Witczak 1996: 167-9), although mainstream comparative Germanic 
linguistics has generally resisted such proposals (cf. Polomé 1992: 77-8). Feist (1928; 
1932) attributed at least part of the structure and lexicon of Germanic to a contact 
situation that arose along the trade routes between northern Europe and 
the Mediterranean. Accordingly, Illyro-Venetic speakers colonized territories in 
the southern Baltic region and Indo-Europeanized a pre-Germanic population, which 
had previously spoken a different language. Feist (1932: 252) conjectured that “the 
mutation of consonants did not take place on Germanic territory, but [rather] 
Indo-European was passed on to the Pre-Germans with its consonants already 
‘mutated’.” The intermediaries for this superstratal imposition of a phonological 
pattern onto the indigenous community were perhaps the Veneti mentioned by 
Tacitus in eastern Germany (Germania, ch. 46), whom he failed to identify as a 
Slavic tribe (Polomé 1980: 194). Though antiquarian, Feist’s substratum studies 
have drawn comment in contemporary scholarship (cf. Polomé 1970b: 49; 1979: 
68; 1980: 186; 1985: 49; Mees 2003: 19-21). It has been asserted that Germanic 
is not particularly archaic (Schutz 1983: 310; Beekes 1995: 29). That assertion appears 
to be grounded in the longstanding assumption that Germanic lost a number of 
original Indo-European verbal categories — such as the imperfect, the aorist, the 
forms of the subjunctive, and the mediopassive (save for a vestigial presence in 
Gothic) — in addition to a reduction of the nominal case system and radical changes 
in phonology (cf. Polomé 1979: 681; 1987: 234). 


3 Europa Vasconica et Semitica? 


Although the presence of an autochthonous prehistoric population in northern 
Europe is probable, the identity of its peoples and their language(s) and culture(s) 
have proved elusive. In numerous publications (summarized in Vennemann 
2003; this volume), Theo Vennemann tackles the fundamental questions that 
arise in connection with pre-Indo-European strata, as part of a research program: 
Who were these people? Where did they come from? Which languages did they 
speak? In Vennemann’s theory, languages of three filiations were spoken in pre- 
historic Europe north of the Alps: Old European (a branch of Vasconic, of which 
Modern Basque is the sole survivor); Atlantic (a branch of what Vennemann 
calls “Semitidic,” of which Hamito-Semitic (Afro-Asiatic) languages constitute a 
daughter subfamily); and West Indo-European. 

The Vasconic Old Europeans dispersed into western, central, and eastern Europe 
starting already in the eighth millennium BCE; they eventually became adstrata 
and then substrata of the other languages. The Vasconic legacy resides in the Old 
European hydronymy - which Krahe (e.g. 1964) construed as Indo-European — 
and toponymy, lexical items for which there are no tenable Indo-European ety- 
mologies, the vigesimal numerical system in Danish and other European languages, 
and the initial-syllable accent of Germanic, Italic, and Celtic (Vennemann 2003: 
324-5). From the fifth millennium BCE onward, seafaring Semitidic peoples 
migrated north along the Atlantic littoral to all the islands and up the navigable 
rivers as colonizers, leaving visible traces in Europe in the form of the Megalithic 
culture. The Semitidic Atlantic languages were, initially, in their areas of 
concentration, superstrata and adstrata. In the west their presence affected the 
Vasconic Old European languages. Indo-European languages became superstrata and 
adstrata everywhere in their spread range, except for the continental northwest 
and north, where Indo-Europeans had arrived before the Atlantic peoples. In 
Germania Indo-European became a substratum of a dominant Semitidic language. 

Regarding these views of language contact, Vennemann has not had many fol- 
lowers. This may be due partly to his propounding some rather bold claims about 
language contact in the remote past, which inevitably invite serious disagreement 
on matters of methodology, interpretation, and fact. Whatever it may fail to account 
for and with due allowance for lexical borrowing from non-Indo-European 
source languages, the standard view that the Germanic strong verb system is the 
result of internal morphological change will remain the preferred explanation 
(cf. Baldi & Page 2006: 2201-4). But resistance to Vennemann’s position is due 
not only to its contrariness with regard to mainstream opinion. Proponents of 
substratum explanations have not (explicitly) attached much importance to 
the theoretical underpinnings of reconstructed language contact. Nevertheless, 
Vennemann’s attempt to identify and empirically ground hitherto unknown, 
unattested linguistic strata is in itself a laudable endeavor. 


4 Germanic Contacts with Finno-Ugric 


In addition to encounters between dialects within its own genetic spread, northwest 
Indo-European came into sustained contact with Finno-Ugric at the beginning 
of the first millennium BCE or perhaps even as early as the latter half of the 
second millennium BCE (cf. Polomé 1992: 82; Salmons 1992: 82; Koivulehto 2002: 
583-5). This contact continued “apparently without noticeable interruptions” 
through the pre-, Proto-, and Common Germanic stages and would extend 
through the historical periods until the present time (Koivulehto 2002: 590). 
Germanic influences manifest themselves most visibly in the form of loanwords 
in (mainly) Finnic and/or Sami. These loans pattern themselves in chronological 
layers, which can be inferred from the phonological features that the words 
exhibit in the recipient languages. The lowering of PGmc. *ē₁ to *ā took place in 
Northwest Germanic but not Gothic, as we see in PGmc. *mēn- ‘moon’ > Go. mena, 
ON mani, OHG/OS mano, OE mona (-ō- < -ā- before nasal). This change took place 
after the breakup of Proto-Germanic and is assignable perhaps to the first century 
BCE (see Koivulehto 1981). Markey (1999: 146-53) analyzes Go. meki ‘short 
sword’, ON mekir, OE mēce, OS maki ‘sword’ as reflexes of a cross-cultural term 
for ‘sword’ that is ultimately related to Greek μάχαιρα ‘knife, dagger’ and entered 
Germanic as a secondarily suffixed *meg-(i)yo- (from a primary root *meg-) with 
the lengthened grade. From Germanic, *mēk(i)ja- ‘sword’ spread to neighboring 
Slavic and Finnic dialects. We may infer from the vocalism of Finnish miekka ‘sword’ 
that the latter was part of an early layer of borrowings from Germanic, miekka 
having been adopted in the mid first millennium BCE but obviously prior to the 
lowering of PGmc. *ē₁ to *ā. Similarly, Go. paida ‘tunic, shirt’, OHG pheit ‘garment’, 
OS peda ‘garment’, OE pad ‘coat’ are generally thought to reflect Greek βαίτη, itself 
probably a cross-cultural word (cf. Lehmann 1986: 271) that reached Germanic 
prior to the consonant shift; Finnish paita ‘shirt’ is a borrowing from Germanic in 
which d > t is regular. 

Owing to the conservatism of some Finnish dialects over the past two millennia, 
the oldest Germanic loanwords constitute important evidence in the reconstruc- 
tion of Proto-Germanic, even if their value has been overestimated at times (see 
Juntune 1973, for discussion). Finnish rengas ‘ring’, for example, is supposed to 
confirm *hrengaz as the correct reconstruction of this lexical item in Proto- 
Germanic, whereas all attested Germanic languages show i in the tonic syllable 
before a nasal plus a consonant (ON hringr, OHG/OE/OS hring). Finnish rengas, 
kuningas ‘king’ (< PGmc. *kuningaz; cf. ON konungr, OHG cuning, OS kuning, OE 
cyning) appear to preserve the Proto-Germanic masculine nominative singular 
ending *-az (< PIE *-os), which, following the fixation of accent on the first or root 
syllable, became reduced or lost altogether in the daughter dialects. 

Koivulehto has proposed numerous etymologies (surveyed in Koivulehto 
2002: 585-90) that would show the Germanic contribution to Finnic and Sami 
lexicons to be sizable; see further Hofstra (1985) and also de Vries (1977: 
xxxiv—xli), who includes younger, specifically North Germanic loans in his lists. 
The deep lexical impact of early Germanic on Finnic might lead one to question 
whether the standard view of adstratal influence adequately captures the contact 
situation in prehistoric northwestern Europe. It has in fact been suggested that 
Germanic also influenced Finnic phonological development owing to the existence 
of a Germanic-speaking superstratum in regions populated by Finnic groups, 
a view that has become widely accepted in Finno-Ugric historical linguistics 
(Posti 1953; Koivulehto 2002: 590-1; but see also Kallio 2000). Koivulehto (2002: 
590) seeks to explain the Finnic consonant gradation (the lenition of p, t, k, s in 
certain environments) as the culmination of internal phonetic tendencies that 
were enhanced by external influence, namely, from a Germanic Verner’s Law that 
antedates the Germanic consonant shift proper (cf. Koivulehto & Vennemann 1996), 
which of course is in itself a controversial proposal. The unidirectionality of early 
Germanic lexical transfer suggests that this putative Germanic superstratum did 
not maintain its position for a long period and was gradually assimilated into the 
native population. 


5 Language Contact within the Northwest 
Indo-European Spread 


Linguistically, Indo-European “constitutes a complicated blend in which the pro- 
portions of the common elements vary greatly from branch to branch” (Prokosch 
1939: 23). Some Indo-European language groups show greater affinities to one 
another than to other groups within the family. Comparative linguistics has 
long been concerned with (inter alia) the earliest relationships between the 
ancestors of the various branches of Indo-European. Germanic shares significant 
commonalities with Baltic, Slavic, Celtic, and Italic, along with some possible links 
to Illyrian and Venetic; see Polomé (1970a; 1970b; 1972) for detailed discussion; 
also Nielsen (1989: 18-28) for a concise survey. These correspondences are pre- 
ponderantly lexical, though grammatical parallels are also to be discerned. 

There is, however, a great deal of uncertainty about prehistoric Indo-European 
groupings between the breakup of the hypothetical linguistic unity and the 
formation of strongly diversified daughter dialects. Not only do we know far too 
little about the origins of Germanic, “there probably existed many more languages, 
more linguistic kernels or nodes, than we can perceive, both Indo-European and 
non-Indo-European” (Evans 1981: 252). One must further reckon with the possi- 
bility of koineization between co-territorial varieties following the reintegration 
of groups that had formed in the course of earlier migrations all along the Indo- 
European spread, with larger units in turn splitting up into new speech communities. 
Regional linguistic correspondences may be due to (1) shared retentions from Proto- 
Indo-European and/or (2) the diffusion of innovations from secondary centers 
established by successive waves of emigrés out of the core speech community. 
(3) So-called Wanderworter (e.g. Neumann 1971: 96; Markey 1989b; Polomé 1992: 
72; Vennemann 2003: 324; Salmons 2004) represent a special category of loanwords 
that spread across languages, usually in connection with trade or the adoption of 
external technological, economic, or cultural practices. It is also to be expected 
that (4) cross-dialectal correspondences are either borrowings or convergences 
resulting from contact between Indo-European groups that settled in adjacent 
territories. But because we cannot establish with precision what position should 
be accorded to Germanic in the Indo-European complex, it will be difficult in many 
cases to know for sure whether a given correspondence came about due to common 
heritage, parallel development in genetically related dialects, or transfer from one 
language to another. 

For expository purposes, we shall proceed from the hypothesis that the second 
and first millennia BCE were a period of gradual divergence and contact between 
the northwestern Indo-European dialects. We can divide the early Germanic 
lexicon into essential components that reflect all four potential correspondence 
sources previously identified. 

(1) There is a considerable body of lexical items that are directly relatable to 
Proto-Indo-European etyma, most notably in the core domains of society (i.e. its 
settings and subdivisions) and economy (see Lehmann 1968). 

(2) There are various sets of words that connect Germanic to other Indo- 
European languages, either individually or in macrodialectal subgroups. 
These two lexical components bleed into one another to the extent that the forms 
in question may be derivable from roots that can be traced back to Proto-Indo- 
European. However, Polomé (1972: 45) stresses that the latter category of words 
is “particularly significant as indices of diachronically staggered areas of closer 
relationship within the IE community.” Consider, for example, PIE *bhar- ‘spinous, 
prickly’ (Markey 1984; 1989b: 589-93), *bhares- ‘barley’ (with *bhar- referring to 
the ‘bristles of the ears’, as per Polomé 1992: 69), which are attested in west (Italic, 
Celtic) and north-central (Germanic, Baltic, Slavic) Indo-European dialects and reflect 
a common cultural development: PGmc. *bariz-, in Go. barizeins ‘prepared with 
barley’, ON barr ‘grain’, OE bere ‘barley’, beren ‘of barley’; Latin far, gen. farris ‘spelt, 
grain, meal’, farina ‘meal, flour’, farreus ‘made of spelt or corn’; Russian bor 
‘millet’ and possibly Old Church Slavonic brasno ‘food’, Russian borosno ‘rye flour’; 
Welsh/Cornish/Breton bara, Old Irish bairgen ‘bread’ (Pokorny 1959: 108-10; 
Polomé 1970a: 57; 1972: 47; 1992: 69; Lehmann 1986: 62). The set may well rep- 
resent a northwestern Indo-European Wortsippe designating ‘food derived from 
cereals’ (Polomé 1972: 47). Barley and other grains were long known to the Indo- 
Europeans. Polomé (1992: 69) points out that *bhar- could ultimately trace its 
origin to a word belonging to a northern European substratum of the northwestern 
Indo-European group. Markey (1989b: 393), however, suggests that a northern 
European substrate term *b(h)e-u-, preserved in ON bygg, OE bēow ‘grain, barley’, 
coexisted with Indo-European *bhar-. If so, then *bhar- would be construed as a 
purely descriptive term used to emphasize a characteristic quality in cereals (‘the 
bearded, prickly grain’) as well as human facial hair (*bhar-dha ‘beard’). 

(3) Germanic also absorbed lexis (and associated items of material culture) 
that traveled through contiguous Indo-European (and non-Indo-European) lan- 
guage groups well after their individualization. The cultivation of hemp is prob- 
ably Scythian in origin, and its introduction to Germanic areas (or at least 
knowledge of the plant) antedates the Germanic consonant shift. Germanic 
borrowed the Greek term κάνναβις ‘hemp’, perhaps by way of a neighboring
language (cf. Latin cannabis, cannabum), whence PGmc. *xanap- (< *kan(n)ab-, via
the Germanic consonant shift) yielding OHG hanaf, OS hanap, OE hænep, ON hampr
‘hemp’. A non-Indo-European etymon for *kannab- is certainly thinkable (cf.
Kuhn 1962: 123). 

(4) As for lexical transfer between adjacent or co-territorial Indo-European 
languages in northwestern Europe: The problem with the putative relationship 
of Germanic with Illyrian and Venetic is that “the limitation of the available mater- 
ial and the disputability of its interpretation, especially in the case of proper names, 
make any far-reaching conclusion hazardous” (Polomé 1970b: 52). Some scholars 
have discerned a close early connection of Germanic with Baltic and Slavic (cf. 
van Coetsem 1970: 29; Porzig 1954: 92). But there seem to be just 10 to 20 old 
Germanic loans in Baltic, by Koivulehto’s count (2002: 591), belonging to differ- 
ent chronological layers; and several of these are thought to have passed through 
Slavic, in which there are considerably more Germanic forms. Slavic borrowed 
the Germanic word for ‘bread’ (PGmc. *xlaibaz), as attested by Old Church
Slavonic chlěbъ ‘bread’. Lithuanian kliepas ‘loaf of bread’ is conventionally seen
as a younger loan via Belarusian (cf. Lehmann 1986: 186), though Koivulehto (2002: 
592) raises the possibility of an older borrowing directly from Germanic, which, 
a fortiori, would be compelling if Latvian klaips ‘bread’ is drawn from early 
Germanic rather than from Gothic hlaifs, gen. hlaibis. Polomé (1972: 54) cautions 
that “the acceptable lexical evidence exclusively shared by the Germanic, Baltic, 
and Slavic tribes is hardly sufficient to draw any definite conclusions as to their 
close relationship or even the level of civilization that they had reached at the 
time of their contact.” 
Germanic, Celtic, and Italic have preserved correspondences that are suggestive
of early contact between speakers of all three branches. Relations between
Italic and Germanic peoples during the Bronze Age are suggested by the fact that 
Italic and Germanic (but not Celtic) share a common word for a metal: Latin aes, 
aeris ‘copper, copper ore, money’, Go. aiz ‘money, metal coin’, ON eir ‘copper’, 
OHG/OS ēr, OE ār, ǣr ‘ore’. After the southward migration of Italic groups from
the last third of the second millennium BCE, one might suppose that Celtic and 
Germanic would have developed together for a longer period as closely related, 
mutually intelligible dialects in a situation of contact (Pokorny 1936: 508). One 
problematic aspect of this hypothesis is that there are no incontestable common
innovations between them in phonology and grammar, to the exclusion of the 
other Indo-European branches; Germano-Celtic correspondences are confined to 
the lexicon (Polomé 1972: 64; Evans 1981: 242-3). For Evans (1981: 252-3), this is 
in itself a strong indication that the two groups were not in close association until 
a fairly late period. 

From 400 BCE to 100 BCE, the advance of Germanic peoples southward 
brought them into contact with the Celts, who were moving north. It was once 
believed that Celtic hegemony over the Germanic world accounted for a long list 
of lexical correspondences (see Mees 2003: 18-19, 25-6 for references and discussion 
of the older literature). Some contemporary scholars have averred that Germanic 
must have occupied a culturally subordinate position with regard to Celtic 
(Polomé 1972: 67; Lehmann 1977: 288-9; 1987: 80). Evans (1981: 253-4) insists 
that there is no historical justification at all for assuming the domination of the 
one group by the other, but cultural preeminence does not imply political 
control (cf. Polomé 1985: 52). Salmons (1992: 95-6) observes that “subordinate” 
lends itself to different interpretations ranging from “subjugated” to “less pres- 
tigious.” Minimal “subordination” in the form of proximity to a somewhat more 
advanced civilization in terms of technology and social organization — combined 
with utility and/or novelty — would be sufficient motivation for lexical borrow- 
ing. In later work Polomé (1987: 221) adopts a more conservative approach to 
interpreting Germano-Celtic agreements in vocabulary: “The type of relationship 
that existed between the Germanic people and the Celts is... hard to define by 
linguistic means: when W. P. Lehmann [1987] argues from legal and institutional 
terminology that the Celts were the givers and the less developed Germanic peo- 
ple the receivers in the linguistic interchange, he may be overstating his case.” 

Estimates of the Celtic Lehngut in early Germanic range from 50 or 60 items 
(Salmons 1992: 96) to a mere 10 or fewer (Lane 1933: 263-4; Evans 1981: 248). In
keeping with Polomé’s conservatism, shared terms are best considered cognates 
belonging to a larger institutional framework of similar, possibly inherited, legal, 
social, and political structures or reflecting common regional innovations and local- 
ized archaisms — unless there is linguistic and/or cultural evidence for their being 
borrowed (Polomé 1983: 281-2; 1987: 221). The Germanic term *rīk- ‘ruler’ (Go.
reiks [ri:ks] ‘ruler’, ON ríkr ‘mighty’, OHG masc. rīhhi ‘ruler, king’, neut. rīhhi ‘kingdom’,
etc.) is arguably an early borrowing from Celtic by virtue of its vocalism.
Since PIE *ē is preserved in Proto-Germanic, PIE *rēg- (Latin rēx, rēgis) should be
reflected by PGmc. *rēk-, whereas in Celtic PIE *ē was raised to ī, whence *rīg-
‘ruler, king’, OIr. rí (gen. ríg), Gaulish -rix. Note, too, that PGmc. *rīk- shows the
effect of the Germanic consonant shift, when voiced stops became voiceless (*g >
*k). Other Germanic words that are widely accepted as Celtic loans include
PGmc. *ambahtaz ‘servant’ (Go. andbahts, with reanalysis of the first syllable as
the prefix and-, OE ambeht, OS ambahteo, OHG ambaht, ambahti ‘office’ > Modern
German Amt) < Celtic *ambaktos ‘member of the retinue of an important leader’
(Polomé 1985: 56-7) implied by Gallo-Latin ambactus (Caesar, Commentarii de
Bello Gallico, ch. 6, 15); and PGmc. *īsarna- ‘iron’ (Go. eisarn, ON ísarn (via West
Germanic), OE īsern, OS/OHG īsarn) ~ *īsar, *īsan (OHG īsan, OE īren), which may
be derived from a Celtic source (*īsarno-), “since the only iron the Germanic world
was familiar with before the impact of Celtic metallurgy seems to be hematite, 
the red iron ore designated by the term raudi (: rauda ‘red’) in Old Norse” 
(Polomé 1985: 57). Lexical transfer in the opposite direction, from Germanic into 
Celtic, amounts to perhaps two or three examples (Lane 1933: 264), e.g. OIr. séol
‘piece of cloth, sail’, Welsh hwyl ‘sail’ from Gmc. *segla- (ON/OE segl, OS segel, 
OHG segal ‘sail’), although the hypothesis that ‘sail’ is a Germanic loanword in 
Celtic is not altogether satisfactory in Evans’ view (1981: 250-1). 


6 Language Contact Following the Breakup of 
Proto-Germanic 


Lehmann (1977: 285; similarly, 1968: 4-5) claims that one of the most notable char- 
acteristics of the Germanic branch is its lack of dialect differences at the time of 
Proto-Germanic: “The unity of Germanic is striking when compared for example 
with the diversity of Greek.” This uniformity is best accounted for by assuming 
“a stable cohesive community, presumably that located around the Baltic for a 
millennium or more.” Major innovations — such as the fixation of accent, the 
Germanic consonant shift (Grimm’s Law), and adjectival inflection (strong/weak)
based on definiteness rather than stem class — diffused across the entire speech 
community; dialect differences arose in Germanic at a late period. Yet, imputation 
of a single, unified speech community largely free of dialect variation to Germanic 
prehistory (Lehmann 1968: 4; 1977: 285-6) is probably an oversimplification (cf. 
Polomé 1972: 45). 

The actual situation may have been more akin to a chain of interrelated 
dialects with no clear internal boundaries. The traditional visualization of such a 
unity incrementally fracturing into three discrete subgroups (North, West, and 
East Germanic) without contact or mutual influence has long been abandoned 
(Kufner 1972: 94-5; Scardigli 2002). Although there are isoglosses that connect North 
and East Germanic, North and West Germanic exhibit a numerically preponderant 
set of linguistic affinities that are not shared by East Germanic and are more heav- 
ily weighted in their probative value. East and West Germanic have virtually no 
isoglosses in common with each other, save for some pronominal forms in Gothic 
and Old High German (Go. 3 sg. masc. nom. sg. is, acc. sg. ina, OHG ir/er, in(an) 
beside OS hē, ina(n), OE hē, hine; ON nom./acc. hann). A period of common development
in North and West Germanic languages after the migration of the Goths
from eastern Scandinavia is widely assumed, at least as a working hypothesis (e.g. 
Kuhn 1955; Antonsen 1965: 36; Kufner 1972: 94; Haugen 1976: 111-12; Ringe 2006: 
213). For its part, Gothic developed more or less separately along the shores 
of the Baltic Sea, east of the Vistula, from the second century BCE to the second 
century CE. Intermediate relationships in Germanic, which have long been the 
subject of scholarly debate (see especially Nielsen 1989: ch. 4 for a critical review 
of the literature), do not directly concern us here. Suffice it to say that Germanic 
peoples “lived at various times after the PGmc. stage in differing constellations
of proximity, [and] retained close contacts with each other even after migrations, 
so that linguistic borrowing [between dialects] is likely to have taken place for 
long periods of time” (Kufner 1972: 74). 

With fragmentation into dialects and increasing temporal remove from the ances- 
tral language, the concept Germanic comes to refer not to a historical object but 
rather a genetic affiliation. The history of contact in Germanic languages becomes 
the diachronic record of discrete speech communities, the outcomes of which depend 
on the nature, duration, and intensity of the linguistic encounters. Salmons (1992: 
90) makes essentially this point when he writes that by the early Roman period, 
Celtic-Germanic contacts cover much less territory than earlier contacts might have, 
indicating greater complexity. 

The evidence from the historical sources makes it clear that in Roman times 
contact continued between Celts and Germanic peoples living along the common 
frontier between them. Caesar described Gaul at the time of his conquests (58-51 
BCE) as being divided into three parts inhabited by the Aquitani, the Gauls (who 
in their own language are the Celts), and the Belgae, each differing from the other 
in language, customs, and laws (Commentarii de Bello Gallico, ch. 1, 1). His sources 
informed him that one Belgic tribe, the Remi, are descended from Germanic 
peoples whose forebears had crossed the Rhine to settle in a more fertile region, 
driving out the Gaulish natives; this tribe is said to have resisted the Cimbri and 
Teutoni, who would go on to harry Roman territories from ca. 113-101 BCE. Other
Belgic tribes — the Condrusi, Eburones, Caeroesi, Paemani — are also said to be of 
Germanic descent (Commentarii de Bello Gallico, ch. 2, 4; ch. 6, 32). Caesar’s report 
has fueled a great deal of speculation over whether the Belgae were an ethnically 
mixed Gaulish-Germanic group (e.g. de Vries 1960: 51). Tacitus retreats somewhat
from his (in)famous supposition that the Germani were an ethnically
homogeneous, indigenous people (Germania, chs. 2 and 4) when he alludes to migration,
mixing, and language shift on the part of certain groups.

During the first centuries of the common era, Latin was the medium of com- 
munication in the “vertical” domains of administration, cultured discourse, and 
the military in territories west of the Rhine and south of the Danube. One is tempted 
to conjecture that there must have been a Latin-based lingua franca or trade
jargon (mixed with Germanic and Gaulish forms) along the long frontier
between the Roman Empire and unoccupied Germania and along the North Sea 
coast. Unfortunately, that question is largely moot. Roman authors took little 
interest in linguistic matters; “the amount of information we have about the complex
language situation in the Empire is appallingly poor” (Polomé 1980: 193).
Multilingualism was probably the norm in the Roman towns and trading centers 
along the Rhine, Moselle, Meuse, and Danube, and among Germanic tribes reset- 
tled on Roman territory, like the Ubii (cf. Polomé 1980: 194). 

The domination of the Romans and the preeminence of their culture assured 
that the influence of Latin would be substantial. Northern Gaul and Roman 
Germania were constitutive of an economic and cultural sphere within the 
Roman Empire, out of which a plethora of Latin loanwords entered West 
Germanic dialects in the domains of administration (OHG keisar ‘emperor’ < Latin 
caesar), the military (OE comp, OHG kampf ‘battle’ < Latin campus ‘plain, field, 
battlefield’), the domestic sphere (OE pytt, OHG pfuzzi, MHG pfütze ‘well, pit’
< put-tius < putjus < Latin puteus), commerce (OS munita, OE mynet, OHG
munizza ‘coin’, MHG münze < Latin moneta ‘coined metal, money’), and agriculture
(OHG/OS fruht ‘fruit’ < Latin fructus), as well as the names of some days of the week
(e.g. Latin diēs Saturni > English Saturday, Dutch zaterdag). The early adoption and nativization
of these words are indicated by the effects of the West Germanic consonant 
gemination, the High German consonant shift, and umlaut. While the assimila- 
tion of -mp- to -pp- in ON kapp ‘contest, ardour’ is indicative of a relatively early 
borrowing, Latin loans in Old Norse are generally later and mediated by contacts 
with northern German (ON fruktr ‘fruit’) or English (ON mynt ‘coin’, pyttr ‘pit, 
pool’, though on the latter see de Vries, 1977: 430-1). Loan exchange in the oppo- 
site direction, from Germanic to Latin, was negligible. 

The first direct testimonials to any kind of Germanic language are inscriptions 
in the runic orthography beginning in the first century CE, most of which were 
found in present-day Denmark, Norway, and Sweden, but with scattered finds 
in Germany, France, and eastern Europe, along the route of East Germanic 
migrations. The prevailing theories on the origin of the writing system tend to 
fall into three categories: (1) it is an adaptation of — or autonomous creation inspired 
by — the Latin alphabet, on the impetus of increased cultural contact at the turn 
of the present era between Germanic peoples and Romans living along the Rhine; 
(2) it is based on North Etruscan epigraphic practice that reached northern 
populations along the amber trade route between Italy and the Baltic coast; 
(3) it is derivable from a Mediterranean script (Latin or Greek) that was imported 
to the north by sea rather than passing through the European continent by land 
(cf. Antonsen 2002: 116). Whichever position one favors, one must recognize that 
individuals in the borrowing oral culture had to have learned the language(s) of 
the literate culture and did not casually imitate a foreign writing system. 


7 The Migration Period (ca. 200-600 CE) 


The emergence of the Germanic peoples upon the scene of recorded history “as 
a progressively intrusive migratory movement was one of the most fundamental 
ethnographic events in early European history. Quite obviously, the cultural face 
of Europe was to be forever altered by this intrusion” (Markey 1986: 248-9). The
origins and causes of Völkerwanderung are variously identified as overpopulation
and competition for resources in the homeland, coastal inundation due to climate 
change, the whiff of wealth from the south, and a power vacuum created by a 
Roman Empire in retreat and inexorable decline. The linguistic consequences were 
language shift, language death, and language spread. 

Early in the present era a number of Germanic tribes that had settled in the eastern Baltic
region and along the Elbe migrated southward. At about 200 CE, the Goths left
their homeland along the lower Vistula and settled in the plains north of the Black 
Sea, eventually occupying a swath of territory between the mouth of the Danube 
and the Don River. One major group, the Visigoths, pressed into Dacia and caused
a Roman withdrawal from the province (275). The Hunnic invasions (from ca. 370) 
forced the Visigoths to take refuge in Lower Moesia (now eastern Bulgaria). Under 
Alaric (ca. 370-410), they marched into Greece and Italy, sacking Rome in 410, 
and from there wandered through southern Gaul into the Iberian peninsula, 
reaching southern Spain by ca. 415. Three years later they re-entered Gaul and 
established a kingdom at Toulouse that encompassed Aquitania and Narbonne. 
Visigothic expansion was checked by the Franks under Clovis I (ca. 466-511) at
the Battle of Vouillé in 507. Meanwhile, the other major group, the Ostrogoths, recovered
their independence after the repulse of the Huns in 451 in the Battle of Chalons 
and the death of Attila (406-53). They penetrated southeastern Europe and in 493, 
under Theodoric (454-526), overthrew Odoacer (436-93), the first Germanic king 
of Italy, seizing Ravenna as their capital. In 535 the Byzantine emperor Justinian I 
(ca. 482-565) commissioned a war of restoration in the former Western Empire. 
With the final defeats of the Ostrogoths in the 550s and the end of Visigothic 
domination after the Moorish conquest of Spain (711), the Goths disappear from 
history. 

The Vandals (Vandilii in Tacitus, Germania, ch. 2) were a collective East Germanic 
group that, like the Goths, appear to have had their origins in Scandinavia (pre- 
sumably Jutland). In the last quarter of the third century, the Vandals migrated 
south into Dacia and Pannonia. Under pressure from the Huns, they entered Gaul 
(406) with the support of another Germanic group, the Suebi, and both groups 
crossed the Pyrenees into Hispania in 409-10. The incursion of the Visigoths spelled 
the absorption of the Suebi and forced the Vandals into northern Africa where 
they besieged Hippo in 430 and captured Carthage in 439. Over the next three 
and a half decades the Vandals raided the coastal areas of the Eastern and 
Western Empires, sacking Rome in 455. The restoration of imperial control over 
northern Africa in 534 meant the expulsion, dispersal, and enslavement of the 
Vandals. During the course of the third century, the Burgundians — yet another 
East Germanic group, the origins of which are traditionally fixed in Scandinavia 
(to wit, the island of Bornholm) — drifted westward from their secondary home 
in the Vistula basin. By the mid fifth century they had settled in what is today 
the French-speaking part of Switzerland and in the south of the French Jura around 
Geneva. From there they occupied Lyon and then spread out over the Rhone region 
toward the south. The Langobards, a Herminonic group, departed from the Elbe 
and established themselves in northern Italy in 568, in the aftermath of the
ruinous Ostrogothic wars. Their kingdom fell to Charlemagne in 774. 

The philological record for Mediterranean Germanic is sparse. Native texts 
are extant only for Gothic, which is represented by the fourth-century New 
Testament translation of the Visigothic bishop Wulfila (ca. 310-83) — preserved 
in manuscripts from the late fifth and sixth centuries — alongside remnants of a 
commentary on the gospel of John and minor fragments. Some of these documents 
very probably originated in Italy, others perhaps in southern France or in the Danube 
region. For the other Mediterranean Germanic dialects, we have only the isolated 
words and names that are recorded in historical and geographic writings from 
Late Antiquity. Nevertheless, it is reasonable to suppose that Germanic language 
varieties exported south of the Alps, to northern Africa, and to the Balkans were 
conglomerations of dialects with varying degrees of koineization and mutual 
intelligibility, already “well mixed and stirred before or immediately after 
confrontation with the non-Germanic languages of the Mediterranean basin” 
(Markey 1989a: 62). Though Langobardian shares certain features with Upper German of the
time, the Langobards are known to have included in their ranks Franks, Saxons,
Goths, Burgundians, Gepids, the last of the Rugians (an East Germanic tribe) 
defeated by Odoacer in Noricum (487), as well as various non-Germanic peoples 
(cf. Markey 1989a). Biblical Gothic evinces Greek influence in the design of the 
writing system, lexis, and word order; occasional Latinisms are also discernible. 
Van Coetsem (2000: 200-12) characterizes the structure of Gothic as “eminently 
Germanic,” even though it has accelerated a pre-existing trend toward what 
he calls “regularization” (reduction in the number of inflectional and morpho- 
phonological variants) and “uniformization” (reduction in the number of categorial 
distinctions, primarily in morphology). The accelerating factor obtains from the 
unstable social situation implied by migration, interaction with other languages, 
and the absence of institutional norm enforcement. Even so, the extent to which 
the high variety of religious texts reflects vernacular forms of the language is debat- 
able. Generally, Germanic Mediterraneans were transitory, minority superstrata 
that were absorbed into the indigenous populations, leaving only slight onomastic 
and lexical traces. That the Flemish nobleman Ogier Ghislain de Busbecq (ca. 
1520-92) could interview two semi-speakers of Gothic in Constantinople in the 
early 1560s leads us to think that some Gothic settlements may have existed as 
late as the sixteenth century in the Crimea. Elsewhere, the enclavement that pro- 
motes language maintenance on foreign soil was never a significant long-term 
factor. 

Meanwhile, two Germanic groups of composite origin appear on the scene in 
the third century. The Alamanni (‘all men’) were an amalgamation of Germanic 
tribes between the upper Danube and the middle Rhine who entered into 
prolonged conflict with the Romans. Between 300 and 450, they pressured and 
eventually succeeded in occupying Roman territories in southwestern Germany, 
present-day Switzerland, and Alsace. Another tribal confederation, the Franks, 
moved into Germania Inferior and northern Gaul during the fifth century. In 486 
Clovis became the absolute ruler of a large kingdom of mixed Germanic and 
Romance populations, one that would over the next several decades incorporate
the Alamanni and Bavarians. By 537 virtually the whole of Gaul was under the 
authority of the Franks, by which time the Gaulish language was on the cusp of 
extinction. Romance and Germanic vernaculars mingled with one another in the 
plurilingual areas of Gaul during Merovingian times. Assimilation of the Frankish 
superstratum proceeded from the Latinization of vertical communication (i.e.
between supraregional institutions and local units) starting in the fifth century 
to the Romanization of horizontal communication (between and within local 
affiliations) by the eighth century; cf. Banniard (1996), van Durme (2002: 12). The 
Strasbourg Oaths of 842, in which two sons of Ludwig the Pious (778-840) swore 
allegiance to one another against their older brother Lothar in the language of 
the army of the other — Ludwig-fils in Gallo-Romance and his half-brother Charles 
in German — are emblematic of the formation of a linguistic border. 

The Bavarians are an Elbe-Germanic group (descended perhaps from the 
Marcomanni) that entered history after they crossed the limes in the mid sixth 
century. Their name appears to be derived from that of a Celtic tribe, the Boii, 
which would suggest Bohemia as their initial expansion area. A handful of 
lexical agreements raise the possibility of ancillary contact with Gothic with the 
introduction of Christianity, e.g. Go. paraskaiwe ‘preparation’ (Greek παρασκευή),
OHG (Bavarian) pherintag ‘Friday’; OHG sambaztag ‘Saturday’ (Modern German 
Samstag), Go. sabbato dags ‘day of rest’. 

The North Sea coast served as the staging area for yet another great migration 
that gave rise to the first permanent extraterritorial Germanic language of the com- 
mon era. This population movement began with the Saxons (yet another merged 
group that first appears in the mid second century), who raided northern Gaul 
and southeastern England from the latter part of the third century. The emigra- 
tion of small parties of North Sea Germanic peoples across the English Channel 
from the early to mid fifth century intensified into a large-scale colonization of 
Britain by Saxons, Angles, Frisians, and Jutes. The linguistic consequences for the 
Romano-Celtic population were rather more pernicious than was the case in speech 
communities once dominated by Germanic Mediterraneans. As early as the fifth 
century, Latin faded away along with the Roman superstratum that introduced 
its language. For Britons who were not killed or displaced during the invasions, 
the Anglo-Saxon conquest meant asymmetrical bilingualism, gradual language 
shift, and eventual assimilation. The areas occupied by the southwestern Britons 
were reduced by the expansion of Wessex in the centuries following the Battle of 
Deorham in 577. On contact and the early history of English, see further Filppula
(this volume). 


8 The Projection of Norse and Norman Power 
(ca. 700-1100) 


Norse expeditions of the so-called Viking Age were an exercise in wealth accu- 
mulation by multiple means — trade, piracy, colonization — that were enabled by 
advances in maritime technology. Of linguistic significance is the exportation of
North Germanic varieties of the period, known collectively as the dǫnsk tunga lit.
‘Danish’ but more precisely ‘Norse tongue’ (Bibire 2001: 89), to territories outside 
of mainland Scandinavia. Norse was introduced to the eastern shores of the Baltic, 
Finland, and deep into Russia, where Swedish traders known as rus (ostensibly 
derivable from Finnish Ruotsi ‘Sweden’, ON róðr ‘rowing’) and Scandinavian mercenaries
constituted a superstratum that melted into the Slavic population after
the mid tenth century. It is unclear to what extent the Swedish spoken natively 
by nearly 300,000 people in Finland and once spoken in Estonia (until World War 
II) derives from language spread during the Viking Age vis-a-vis secondary 
migration in the Middle Ages (Barnes 2005: 183). 

The ninth century saw an influx of (mainly) Norwegians in Shetland, Orkney, 
the Hebrides, Isle of Man, parts of Scotland and Ireland, the Faroe Islands, and 
Iceland. In areas where settlement was dense and the immigrants formed cohe- 
sive polities, Norse may have continued in use for many generations. In the first 
two offshore island groups and on the north coast of mainland Scotland, a 
descendant of the immigrant language called Norn was maintained until the eigh- 
teenth century. But as a rule, Norse disappeared, though not without leaving its 
mark on the local languages to which it succumbed. Only in the hitherto sparsely 
populated Faroes and Iceland did Norse take permanent root and develop into 
new languages. Icelanders and Norwegians established toeholds in Greenland 
(ca. 986) and from there explored parts of eastern Canada. From archeological 
evidence, it appears that there was robust trade between the Greenlandic 
Norse and native people, which hints at an erstwhile jargon. Today, the sporadic 
Old Norse loanwords in Inuktitut (catalogued by van der Voort 1996) are the 
monument to the Norse speech community in Greenland, which perished in the 
fifteenth century. 

Danish and Norwegian invaders settled in the north and east of England 
between 865 and 955. The campaigns (991-4) of Olaf Tryggvason (later king of
Norway, d. 1000) and King Svein Forkbeard (960-1011) of Denmark installed 
Danish sovereignty over England, which would continue through the reigns of 
Svein’s son Canute (1014-35) and grandson Harthacanute, who died without 
issue in 1042. During this same period, the Danish (Norwegian?) chieftain Rollo 
(ca. 860-932) and his descendants consolidated and expanded their hold on 
Normandy, only to have their community become part of the French-speaking 
stream. 

Our knowledge of what was surely an intense contact situation in the English 
Danelaw is grounded in inferences made from the heavy borrowing of content 
words (e.g. Modern English egg, sky, ill, take, get, die from ON egg, ský ‘cloud’, illr
‘evil, bad’, taka, geta, deyja via OE diegan, replacing steorfan, which survives in Modern
English starve) along with some closed-class functional items, viz. the prepositions
till, fro (in to and fro) < ON til, frá ‘from’, and 3 pl. pronouns they, them, their <
ON þeir, þeim, þeirra (supplanting OE nom./acc. pl. hie, dat. pl. him, gen. pl.
hira/heora), and place-name elements (e.g. -by, ON býr ‘farm, town’). The presence
of Norse speakers may have facilitated the selection of are < Midland/Northern
ME are(n) < OE (Mercian) earon (cf. ON 1 pl. erum, 2 pl. eruð, 3 pl. eru) at the expense
of the southern plural present indicative form of ‘be’ (OE sindon, sint). Language 
shift from Norse to English in the Danelaw was probably largely complete by 1100 
(Thomason & Kaufman 1988: 267, 282). 

French-speaking Normans with Breton and Flemish allies conquered England 
between 1066 and 1070 and proceeded to Ireland a century later. As the language 
of the court, the higher nobility, of the legal and ecclesiastical superstructure, and 
of European culture, French enjoyed enormous prestige until well into the four- 
teenth century, by which time the resident body of French speakers in England 
had become largely assimilated. English came to acquire an immense number of 
French borrowings (as well as Latin and Greek forms through the medium of 
French) along with a considerable amount of nonnative derivational morphology 
(e.g. in-, -ive, -(at)ion, -al, -ic, -ity, -ment, etc.) with accompanying morphophonology
(/d ~ ʃ/ in accede : accession, /d ~ ʒ/ in decide : decision, /k ~ s/ in electric :
electricity, /t ~ ʒ/ in equate : equation). Some morphemes are restricted to combinations
with other borrowed elements (-ceive, as in perceive, receive, deceive), while
others can combine with varying degrees of freedom with native or non-Latinate 
morphemes (drinkable, nonwhite, remake, botheration, de-Baathify). 

The Norse and Norman contribution to the development of English has been 
thoroughly described in the standard handbooks and in the essay by Filppula 
in the present volume. Suffice it to note here some points of episodic controversy.
Some linguists would relate the lexical replacement (relexification), inflectional 
simplification, and concomitant analyticity (with fixed SVO order) that we observe
in Middle English to parallel phenomena in language contact situations that give 
rise to creoles (cf. Milroy 1984: 11; though see now Milroy 2007: 17-18). 

Thomason and Kaufman (1988: 263-342) devote a not inconsiderable portion 
of their seminal study to demonstrating four overarching theses: (1) Although the 
Norse influence on English was pervasive, affecting all parts of the grammar, it 
was not deep, except in the lexicon (p. 302); creolization is not required to explain 
the facts of English in the Danelaw. (2) Norse did not stimulate inflectional 
simplification in English, which was in progress well before external influences 
became a factor (p. 303); similarly, Allen (1997), who does see language contact 
as playing a role in the acceptance of internally motivated deflection. (3) With 
respect to French, the linguistic consequences for English amount to nothing more 
than normal borrowing by a substrate language in a situation of occasional bilin- 
gualism. That French did not reach very far beyond the higher domains would 
undercut Dalton-Puffer’s (1995) thought experiment that the transition to Middle 
English is equatable with imperfect (due to unstable bilingualism) communal 
shift away from a “dying” (i.e. recessive) majority language (Old English) to a 
dominant minority language (Norman French). (4) Finally, English is not 
significantly more “foreignized” or simplified than Danish, Swedish, Dutch, or 
northern Low German (Thomason & Kaufman 1988: 321). 

Occupying the middle ground is O’Neil (1978), who distinguishes between the 
simplification of the English morphosyntactic system prior to contact with Norse 
and the neutralization of superficial aspects (e.g. inflection) belonging to two closely 
related co-territorial systems. Picking up this thread, Dawson (2003) characterizes
the outcome of the contact situation in eastern and northern England as a koine, 
which was enabled by a close genetic relationship and (hence) typological prox- 
imity between English and Norse. McWhorter (2002; 2005: ch. 11) acknowledges 
that there are no grounds for treating English as a creole but nevertheless con- 
tends that extensive second language acquisition by Scandinavians did in fact 
simplify English grammar to a considerable extent. Similar arguments for heavy 
influence from Irish on English, but not creolization, are put forward in the detailed 
study found in Hickey (1997). 


9 Superposition 


The introduction of Christianity placed Latin in a privileged relationship with respect 
to the metropolitan Germanic languages, if only in specific domains, as hagiolect, 
the language of literacy, and as a quasi-official language of state. The period of 
Latin superposition commenced with the Christianization of Anglo-Saxon England, 
from the beginning of the seventh century, and the establishment of monastic 
culture. It is bracketed by the northern Renaissance of the fifteenth and sixteenth 
centuries, during which Latin served as a language of learned disquisition. 
Latin contributed religious and specialized terminology to the local Germanic 
languages, and it also exerted influences on vernacular written style. With the 
vernacularization of literacy, the Bible, and official documents, which took place 
at different times and under different circumstances in Germanic-speaking regions, 
the position of Latin gradually attenuated. The Roman Catholic church, however, 
continued to use Latin for liturgical and official purposes.

The Hanseatic League united Lübeck, Hamburg, Bremen, and other northern
German cities in an alliance for the purpose of mutual trade and protection. The 
cartel came to dominate mercantile activity in the North Sea and Baltic between 
1250 and 1450. The importance of Middle Low German as a medium of commu- 
nication in the domains of commerce, law, and administration secured its long- 
term prestige and influence in Visby, Stockholm, Kalmar, Bergen, and other 
emerging cities in Scandinavia. Through widespread bilingualism within the 
merchant class and civil bureaucracy and significant German presence in urban 
areas, the Middle Low German imprint on Danish, Norwegian, and Swedish is 
every bit as deep as that of French on English (Thomason & Kaufman 1988: 315; 
Braunmüller 2000). As the Protestant Reformation got under way in Germany in
1517, High German eclipsed Low German as an important source of loanwords 
in the north. 


10 Yiddish 


Yiddish represents a case of migration and language genesis, though the details 
of its origin are contested. A long-held view (Weinreich 1980) is that from roughly
1000, Jews speaking Loez (a Judeo-Romance language originating in France)
settled in German-speaking areas along the Rhine. The newcomers created their
own ethnolect through the fusion of Hebrew-Aramaic, Loez, and German. As
Ashkenazic Jews moved east, starting in the mid thirteenth century, a Slavic com- 
ponent was fused to the vernacular. Weinreich’s model of Yiddish formation has 
given way to a more recent view that posits separate origins for the westernmost 
varieties in the Rhineland (Loter) and fixes the origins of Yiddish further east, 
based on the presence of Bavarian (and to a lesser extent East Central German) 
features in the German component of Yiddish. Wexler’s (1991) hypothesis that 
Yiddish arose via the relexification of Judeo-Slavic by German-speaking Ashkenazic 
Jews is radical and controversial. 


11 Hybridized Mediums of Interethnic 
Communication in the European Metropole 


German gradually spread eastward from the ninth century until its abrupt retreat 
after World War II. One consequence was the creation of German-speaking 
Sprachinseln in Russia, the Czech and Slovak republics, Slovenia, Hungary, and 
Romania (e.g. the Siebenbürger Sachsen). But another consequence was the genesis
of transitory restructured forms of German used for interethnic communication. 
Halbdeutsch refers to interlanguage versions of German spoken by Estonians and 
Latvians from the late Middle Ages through the nineteenth century (Stammler 
1922; Mitzka 1923; Lehiste 1965). Mitrović (1972) describes a German-based pidgin
that crystalized in Bosnia during the late nineteenth century among workers from 
all parts of the Austro-Hungarian Empire. 

Russenorsk was a seasonal pidgin used by Norwegians and Russians involved 
in the fish trade in northern Norway from the late eighteenth century up until 
the Russian revolution of 1917. Its lexicon is drawn in roughly equal measure from 
the two stock languages, though there are also lexical items from Sami, Low 
German/Dutch, French, Swedish, and English (Broch & Jahr 1984; Jahr 1996: 110). 
Broch (1996) calls attention to an English-Russian trade jargon at Archangel on 
the White Sea during the second half of the eighteenth century. 

In our century countries like Germany, the Netherlands, the United Kingdom, 
Denmark, Sweden, and Norway are still considered “monolingual” but have a 
vast array of languages spoken within their borders. Of particular interest to 
linguists are the verbal repertoires of immigrants. The best-known case is that of 
Foreign Worker German, which is spoken by laborers who have migrated from 
southern Europe and Turkey since the late 1950s. Some studies (Clyne 1968; Gilbert 
& Pavlou 1994) have characterized Foreign Worker German as an industrial 
pidgin on the basis of its structural properties and obstacles to targeted second 
language acquisition due to the ghettoization of its speakers. However, the 
current consensus is that the German of these immigrants has been constitutive 
of a continuum of interlanguage varieties (Klein & Dittmar 1979; Hinnenkamp 
1984), which is being superseded by ambilingualism in generations born in the
host countries. 


12 Germanic Languages beyond the 
European Metropole 


During the fifteenth and sixteenth centuries, Portugal and Spain commissioned 
exploratory voyages and the establishment of trading outposts in West Africa, 
the Americas, and Asia. Over the course of the sixteenth century, the enterprise 
became one of political domination, conquest, and colonization. Other maritime 
European nation-states — most importantly, France, England, and the Netherlands 
— and private interests operating therefrom became part of a large-scale movement 
of people and exportation of economic and cultural practices, which transformed, 
if not disrupted, indigenous linguistic ecologies.

From the seventeenth through the nineteenth centuries, England spread its 
language more widely ex patria than any other European imperial power (Hickey 
2004). “Neo-European” and indigenized varieties of English, along with English- 
lexified pidgins and creoles (the latter in colonies that put in place economies 
based on plantation agriculture utilizing nonindigenous slave or in some cases 
indentured labor) are grouped by region — North America, the Caribbean basin, 
Africa, Asia, Australia, and the Pacific - and assigned separate chapters in this 
handbook. 

Despite a far-flung commercial empire extending from North America to 
Indonesia by the middle of the seventeenth century, the Dutch linguistic legacy 
overseas is comparatively modest. In New Jersey and New York vestiges of New 
Netherland (1614–74) Dutch survived into the early twentieth century, including
a subdialect that was spoken by descendants of slaves. Today, Dutch has official 
status in Suriname, the Netherlands Antilles, and Aruba. There are but three known 
Dutch-lexified creoles in the Caribbean. Negerhollands was spoken on St. Thomas 
and St. John in the US Virgin Islands, which were originally settled by Dutch planters 
and their slaves (in 1672 and 1717, respectively). Its last fluent speaker died in 
1987. Two other Dutch-lexified creoles were developed in Guyana during the 
seventeenth and eighteenth centuries by people living along the Berbice and 
Essequibo Rivers. Berbice Creole Dutch is on the brink of extinction, while Skepi 
Creole Dutch became extinct by 1998. In 1652 the Dutch East India Company estab- 
lished a station at the Cape of Good Hope. Three groups were responsible for the 
formation of a semicreolized Cape Dutch Vernacular during the late seventeenth 
and eighteenth centuries: Dutch, German, and French settlers; the indigenous 
Khoekhoe; and enslaved peoples of African and Asian origin (from 1658). By 
the mid nineteenth century, the sectarian Muslim community in Cape Town had 
developed a tradition of composing religious texts in the Cape Dutch Vernacular, 
using Arabic orthography. Starting in the 1870s, Afrikaner nationalists cultivated 
their own dialect, which they called Afrikaans, as a written medium and promoted
its use in domains hitherto reserved for English or Dutch. This movement
culminated in standardization and official recognition in 1925. 

Secondary immigration brought Dutch to the American Midwest (Michigan, 
Wisconsin, Iowa) in the nineteenth and early twentieth centuries, but the language 
gradually fell out of regular use and has largely vanished. More or less contem- 
poraneous Norwegian and Swedish immigrant communities in the United States 
Upper Midwest and Canada have shown similar patterns of shift to English since 
the 1920s. 

Due to emigration, German is spoken in North America, South America (Brazil, 
Argentina, Paraguay), and southern Africa. Germans arrived in Philadelphia in 
large numbers soon after its establishment in the 1680s and fanned out across 
Pennsylvania and then southward into parts of Appalachia, comprising the 
largest non-English-speaking European group by the American revolution in 1776. 
Pennsylvania German remains the home language of roughly a quarter million 
bilingual descendants in sectarian Anabaptist communities in the United States 
(Pennsylvania, New York, Ohio, Indiana, among other states) and Canada 
(mainly southern Ontario). These speakers have retained the Early Modern 
founder dialect (Pfälzisch) of their forebears, suitably koineized and adlexified
by American English. From the 1840s, massive secondary German immigration 
created German-speaking enclaves in the American Midlands (especially Ohio, 
Wisconsin, and Iowa) and Texas that had their span but have for the most part 
given up the heritage language in favor of English. 

Germany’s bid for an overseas colonial empire was late and short-lived 
(1871-1918). Its linguistic legacy consists of a settler dialect of German still spoken 
as a first language in the former colony of South West Africa (now Namibia) and 
incipient German-lexified pidgins that are attested in Kiautschou and Papua 
New Guinea from the late nineteenth century up to the outbreak of World War 
I in 1914 (see Mühlhäusler 1980; 1984). Residue from the earlier German pidgin
is preserved in Tok Pisin raus ‘get out’, gumi ‘rubber, tube’, beten ‘worship, pray’ 
(German heraus, Gummi, beten). Unserdeutsch is the autonym for a restructured, 
creolelike variety of German that developed in a mission school and orphanage 
near Rabaul, in Papua New Guinea, and is now endangered (Volker 1991). 


NOTE 


1 The following abbreviations are used in this chapter: 


Go      Gothic
ME      Middle English
MHG     Middle High German
OE      Old English
OHG     Old High German
OIr     Old Irish
ON      Old Norse
OS      Old Saxon
PGmc    Proto-Germanic
PIE     Proto-Indo-European
21 Contact and the Early
History of English 


MARKKU FILPPULA 


1 Introduction 


The purpose of this chapter is to describe the main sources of external linguistic 
influences on English in the medieval period and to demonstrate the importance 
of such influences for the development of the language. The focus will here be 
on contact effects in syntax rather than the other domains of language, such as 
lexicon, which has perhaps received the most attention in previous works on 
medieval contacts. For reasons of space, I will mainly concentrate on “mainstream” 
varieties of English, although contact effects can be equally detected in dialectal 
varieties. 

A prerequisite for linguistic contacts with any kind of lasting effects is regular 
intercommunication between speakers of two or more different languages. In the 
periods at issue these contacts took place in different forms under very different 
kinds of sociohistorical circumstances, leading to different kinds of linguistic out- 
come. In order to understand the nature and extent of these, we must begin with 
a brief description of the main potential sources of external influence. The fol- 
lowing discussion follows a roughly chronological order, although it has to be 
borne in mind that the influences deriving from one or the other of the potential 
sources have in most cases spread over long periods of time and in some cases 
have most probably been overlapping temporally. 


2 The Main Sources of Foreign Influence on 
Medieval English 


2.1 Influences from (British) Latin and Celtic 


The earliest external influences on Insular Anglo-Saxon (Old English) are those 
deriving from (British) Latin and British Celtic, which were the predominant 
languages spoken in Britain at the time of the so-called adventus Saxonum in the 


The Handbook of Language Contact. Edited by Raymond Hickey
© 2010 Raymond Hickey. ISBN: 978-1-405-17580-7 




mid fifth century CE. In the light of recent research it is evident that there had 
been sporadic contacts between the Germanic invaders and the indigenous 
people of Britain, i.e. the British Celts, even before the adventus, but these had not 
led to the kind of large-scale invasions and settlements that followed in the after- 
math of those led by Hengest and Horsa in 449 (see, e.g. Sims-Williams 1983). 
The Germanic invasions and gradually expanding settlements in the next couple 
of centuries after the adventus brought about a situation in many parts of Britain 
which could, by normal standards, be expected to have created particularly intense 
contacts between the British Celts and the Germanic newcomers. However, until 
very recently, the prevailing view on the early medieval history of Britain has 
held that the Germanic invasions led to a rapid and wholesale extermination or 
“ethnic cleansing” of the indigenous British and Romano-British population, 
especially in the southern and eastern parts of the country. Those few of the native 
population who survived this massacre fled to the western and northern fringes 
of the country. This account, sometimes called “Anglo-Saxonist,” goes back to such 
Victorian historians as Freeman (1870) and also entails a belief that the English 
people and their Anglo-Saxon ancestors are somehow “purely” Germanic, with 
virtually no trace of native British elements. 

It is no doubt true that the details and outcomes of the Germanic-British 
encounter varied a great deal from one region to another; this can be best seen 
from the geographical distribution of Celtic-derived river names and place-names,
which suggests an almost wave-like expansion of the Germanic popula-
tion from the east to the west (see, esp. Jackson 1953 on river names; Coates & 
Breeze 2000 on place names). However, modern archeological and historical 
research has shown that the “ethnic cleansing” theory rests on rather tenuous evi- 
dence. For example, the archeologists Lloyd and Jennifer Laing (Laing & Laing 
1990) question it on the basis of their scrutiny of early medieval archeological finds, 
including technology used to produce pottery and other household objects, dif- 
ferent kinds of artistic objects and designs, and metalwork. Their conclusion is 
that the evidence does not support widespread massacre of the Romano-British 
population in either towns or countryside, and that widespread intermingling of 
the two cultures was more likely than sharp polarization and conflict (Laing & 
Laing 1990: 69, 95). 

Large-scale ethnic cleansing is also most unlikely in the light of what we know 
about the demographics of the early contact situation. Estimates of the immigrant 
to native ratio vary from Higham’s (1992) arguably conservative 1:100 to Laing 
and Laing’s (1990) 1:20 or 1:50, and to the most recent estimate of 1:5 by Härke
(2003). Despite these differences, it is evident that the Germanic immigrants 
formed only a relatively small proportion of the population of Britain in the first 
centuries after the adventus Saxonum. What is even more important in this con- 
nection is that present-day researchers widely agree that the Celtic-Anglo-Saxon 
interface in this period was characterized by a process of acculturation and assim- 
ilation rather than extermination. 

The “acculturation theory” also receives significant support from recent population- 
genetic studies. Capelli et al. (2003) show, first, that there has not been complete 
population replacement anywhere in the British Isles; secondly, that there has been
considerable continental introgression in the central-eastern part of England; and 
thirdly, that the data from southern England show evidence for significant con- 
tinuity of the indigenous (i.e. Celtic) population. An essentially similar picture 
emerges from the results of the Oxford Genetic Atlas Project (see Sykes 2006). This 
project collected and analyzed both matrilinear mitochondrial DNA and patrilinear 
Y-chromosome samples of over ten thousand subjects from all over Britain and 
Ireland. The results provide strong evidence for the survival of the Celtic-speaking 
population in Britain and Ireland. 

To sum up on the historical background to the contact situation, we can state that, 
instead of a large-scale replacement of the British and Romano-British population, 
the demographic and sociohistorical circumstances surrounding the adventus 
were such that regular communication and more or less peaceful coexistence 
between the indigenous population and the newcomers were most likely an everyday
reality in most areas of Britain. This, in turn, means that there must have
been a period of extensive bilingualism for a considerable length of time after the 
first arrival of the Anglo-Saxons - or rather, trilingualism, as at least some of the 
British Celts probably also had good knowledge of (British) Latin. After all, Latin 
was the only written language in early post-Roman Britain, as Schrijver (2002: 89) 
points out. During this period, the Britons shifted to English and were gradually 
assimilated to the Anglo-Saxon population both culturally and linguistically. Of 
course, the rate of language shift must have varied from one area to another, 
depending on the demographic and other specifics of each area; as mentioned 
above, we can obtain some idea of the time scale of the shift from the river- and 
place-name evidence discussed by Jackson (1953) and Coates and Breeze (2000), 
among others. 

The traditional view on the linguistic outcomes of the earliest contacts between 
the Britons and the Anglo-Saxons can be briefly summarized as follows: apart 
from river names, place names, and some personal names, the indigenous Celtic 
language has left hardly any trace in Old English (OE) or subsequent English, 
for that matter. One of the first and most influential exponents of this view, Otto 
Jespersen (see Jespersen 1905), explained this through the social, cultural, and
political supremacy of the Anglo-Saxons vis-a-vis the Celts. Though repeated in 
textbooks on the history of English up to the present day, this account has come 
to be questioned in many recent works. Indeed, one can speak of a revival of 
what can be called the “Celtic hypothesis” with respect to some central areas of 
English morphosyntax and even phonology. Raised by such early pioneers as Keller 
(1925), Dal (1952), and Preusler (1956), the possibility of Celtic influence had largely 
been ignored by historians of English, but a new wave of interest began with an 
article by Patricia Poussa (1990), in which she argued for a Celtic contact origin 
for the English “periphrastic do” construction. Despite its rather reserved initial 
reception among Anglicists, Poussa’s study has since been followed by a steady 
stream of fresh research on possible Celtic influence on several other aspects 
of English syntax and phonology. Thus, Raymond Hickey (see, e.g. Hickey 1995) 
discusses what he calls “low-level” phonological contact influences from British 
Celtic, which may well have affected the pronunciation of OE in certain types of
contexts and thereby contributed, for example, to the general erosion of OE 
inflections. In a more recent article (Hickey 2002a), Hickey discusses another pos- 
sible area of contact-induced change with a Celtic connection, namely changes 
in the word order systems of OE and Old Irish. In Hickey (2002b) he considers 
the arguments for contact and/or convergence with a number of syntactic 
phenomena including the “northern subject rule” and the progressive. Another 
scholar writing on general typological influence of Celtic on English syntax and 
morphophonology is Hildegard L. C. Tristram (see, e.g. Tristram 1999), who also 
puts the attrition of OE inflections down to contacts with British Celtic. In a 
similar vein, Vennemann (see, e.g. Vennemann 2000; 2001; 2002) discusses a 
number of syntactic features, such as the internal possessor construction, which 
English shares with Welsh and Irish but not with German. The old problem of 
the rise of periphrastic do in English, revived in Poussa (1990), is treated from a 
typological perspective in van der Auwera and Genee (2002), who also note the 
uniqueness of English among Germanic languages with respect to this feature. 
Finally, one could mention the many articles on Celtic influence contained in the 
book The Celtic Roots of English (Filppula, Klemola, & Pitkänen, 2002) and the recent
monograph on English–Celtic contacts coauthored by Markku Filppula, Juhani
Klemola, and Heli Paulasto (2008). 

Much of what has been said about the paucity of Celtic influences in OE applies 
to the question of British Latin influence on it. The standard view is that since 
Latin had ceased to be generally spoken in Britain by the time the Roman rule 
there came to an end, there was hardly any direct contact between speakers of 
British Latin and OE, and consequently, little opportunity for Latin elements to 
be transferred to OE. For example, Baugh and Cable state that “[i]t would be hardly 
too much to say that not five [Latin] words outside of a few elements found in 
place-names can be really proved to owe their presence in English to the Roman 
occupation of Britain” (Baugh & Cable 1981: 80). Again, this wisdom has been 
challenged in some of the recent research especially by Peter Schrijver (see 
Schrijver 2002). He has shown that, despite obvious problems in reconstructing 
the exact linguistic detail of British Latin, which was most likely itself heavily 
influenced by substratal Celtic features, there was more British Latin spoken in 
Lowland Britain at the time of the adventus than has generally been assumed. 
Schrijver argues that British Latin also acted as a kind of linguistic go-between 
or buffer in the British Celtic-OE interface and in fact filtered out especially some 
Brythonic phonological features that could otherwise well have been carried over 
to OE (Schrijver 2002: 109). Pending further research on the survival of British 
Latin and its linguistic legacy in OE, we have to be content to note that there is 
so far no evidence of morphosyntactic influences from the direction of British Latin; 
for these, we need to look to Classical and later Latin influences, which mostly 
come into OE and Middle English (ME) along with the later Christianization of 
Britain, starting at the end of the sixth century. In the discussions below, we will 
see that general Latin influence is considered possible for a number of syntactic 
features such as the English progressive and the so-called cleft construction. 
2.2 Influence from Scandinavian languages


The main historical events leading to Scandinavian influences on Old and Middle 
English are the invasions of the so-called Viking Period. These invasions were 
carried out by Danish and Norwegian Vikings and, according to the Anglo-Saxon 
Chronicle, started in 787 CE and lasted until 1042. This year marked the end of 
the 25-year reign of a Danish king in England, which had come about through 
the expulsion of Æthelred, king of England, in 1014, and his replacement by the
Danish king Svein and, shortly after him, his son Cnut. Beginning with small-scale 
plundering forays up until the mid ninth century, the Scandinavian invasions 
gradually escalated into widespread attacks by large armies and eventually led 
to extensive settlements of Vikings primarily in the eastern and northern parts 
of England, the western parts of Scotland, the Isle of Man, and some parts of Ireland, 
too. Apart from large numbers of Scandinavian place names, the extent of 
Scandinavian presence especially in the so-called Danelaw area manifested itself 
in forms of legal procedure and local government, as well as local economy. Initially 
very hostile, relationships between the settlers and the Anglo-Saxon-Celtic 
population turned in the course of time into ones favoring peaceful coexistence, 
intermarriages, and eventual amalgamation of the two populations. This, in turn, 
meant a considerable influx of Scandinavian elements into the English language, 
a process which was greatly facilitated by the linguistic affinity of the 
Scandinavian languages and OE. As in the case of British Celtic, there is little in 
the way of direct evidence for the length of survival of the Scandinavian language 
in the British Isles, but it is generally believed that in some parts of Scotland, it 
survived as late as the seventeenth century. However, in the more southern areas 
it most likely disappeared much earlier. 

As mentioned, the large numbers of Scandinavian place names especially in 
the eastern and northern parts of England and Scotland are living testimony to 
the extent of Scandinavian linguistic influences on English. Textbooks provide long 
lists of names ending in -by, -thorp, -thwaite, -toft, etc.; it is estimated that their 
number amounts to some 1,400 in all (see, e.g. Baugh & Cable 1981: 97). In con- 
trast, the number of common noun loans in OE remains remarkably small, but 
goes up significantly in ME. In this case, the borrowed words relate mainly to 
everyday vocabulary, with many of them being current even at the present 
day. Similarly to the Celtic-English interface, morphosyntactic and phonological 
influences are something to be expected in the type of language shift situation 
that the Scandinavians were involved in soon after they had settled permanently 
in the country. In the former domain, the pronouns they, them, their, both, and same, 
the prepositions till, fro, and though, some inflectional elements (including the third 
person present indicative -s suffix), the present participle ending -and/-end/-ind 
of late OE and early ME, superseded by -ing in later English, omission of the 
relative pronoun in relative clauses (the so-called “contact-clause”), so-called 
“stranding prepositions,” and the uses of the auxiliaries shall and will are usually 
mentioned as probable examples of Scandinavian influence (see, e.g. Baugh & Cable 
1981: 102ff.; Kastovsky 2006: 223). To these could be added several other changes 
which McWhorter (2002: 254) has described as “decrease in overspecification and
complexity” of the grammatical system of English. These include, among others,
the use of the periphrastic perfect construction HAVE with both transitive and 
intransitive verbs, a development which gained momentum in ME, possibly trig- 
gered by a similar feature of Old Norse (McWhorter 2002: 236ff., 258; Fischer & 
van der Wurff 2006: 141-2). Other possibly contact-induced changes discussed by 
McWhorter (2002) are the loss of inflectional morphology, including marking 
of grammatical gender on the definite article, erosion of V2 word order, and loss 
of “inherent reflexive marking” common in other Germanic languages (as in German 
sich rasieren ‘to shave’, sich beeilen ‘to hurry’). 

In phonology, the evidence is much scantier but it has, for example, been suggested
that the fricativization of kw- or hw- to χw- in words like cwicu ‘quick’ in
Northumbrian dialects of OE could be due to Scandinavian influence (see, e.g. 
Lutz 1988; Dietz 1989). However, this view has been contested in more recent 
research by Stephen Laker (see Laker 2002), who instead argues for Celtic 
influence on this feature of Northumbrian. 


2.3 Influence from French


The Norman Conquest and its aftermath marked the introduction of (Norman) 
French as the predominant language of the nobility in England, and this situation
lasted for some three hundred years. The most profound influences affected
English vocabulary, but French also left its traces in English phonology, spelling, and,
to some extent, morphosyntax.

As noted, the most noticeable linguistic change was the influx of French words
into English in the late ME period when the speakers of Norman French even- 
tually shifted to English. Standard textbooks such as Baugh and Cable (1981) pro- 
vide ample lists of lexical loans from French, ranging from ecclesiastical words 
to words relating to law, army and navy, fashion, meals, social life, art, learning, 
and medicine. Given the great numbers of lexical loans adopted in the centuries 
following the Conquest, it is not surprising that French influence also left its trace 
on English spelling and even phonology. 

It is more difficult to demonstrate (morpho-)syntactic influences from French 
but some features have been mentioned as having their source in French. These 
include, most notably, some uses of prepositions (e.g. at, by, in) and completely 
new ones based on (more or less) similar French usages (e.g. according to, consider- 
ing, during, excepting) and adverbs (albeit, as, very; for further discussion, see 
Mustanoja 1960: 316ff., 348-9; Baugh & Cable 1981: 166-7). There is less evidence 
of French influence upon “core” syntactic features, but the progressive form 
and the so-called it-cleft construction are among potential candidates because of 
close French parallels. These two features will be discussed in greater detail in 
sections 3.1 and 3.2. 


2.4 Influence from Classical and later medieval Latin 


It is customary to distinguish between three sources of Latin influence on OE: 
(1) continental, i.e., contacts before the Germanic tribes had left their continental
homelands; (2) influence of British Latin (discussed in 2.1 above); (3) influences
brought about by the Christianizing of Britain from the end of the sixth century 
onward (see, e.g. Baugh & Cable 1981: 75; Kastovsky 2006: 220). Of these, the last- 
mentioned phase has left the most noticeable traces in the English language. As 
in the case of contacts with French, vocabulary was the domain of language which 
was the most susceptible to influences from Latin. The standard wisdom has it 
that the introduction of Christianity brought along numerous lexical loans that 
had to do with religion and various kinds of church institutions and religious 
services. Other central areas of life where lexical borrowing occurred were edu- 
cation, books, and learning, but words relating to everyday life, such as articles 
of clothing, household goods, medical terms, animals, and foods, were also 
adopted (Baugh & Cable 1981: 81-91; Kastovsky 2006: 222). Yet, as Kastovsky 
(2006: 222) points out, the numbers of Latin words borrowed in the OE period 
remained relatively small when compared to those adopted in the following 
centuries up until the Early Modern English period. According to Kastovsky, this 
may be explained by the all-pervasiveness of other than direct borrowings, 
namely semantic loans, loan translations, and loan creations (2006: 222). 

In addition to lexical loans, Latin has been argued to have affected the syntax 
of OE as well. The progressive, which was mentioned in the previous section as 
a possible French-influenced feature, according to many scholars has its origin in 
OE texts that are translations from Latin originals, and the same holds for the 
other feature to be discussed in the next two sections, clefting. In addition to these, 
one could mention the so-called “accusative-plus-infinitive” construction (as in 
We believe this to be wrong), which may well be a syntactic borrowing from Latin 
(see, e.g. Fischer & van der Wurff 2006: 193-4), and the loss of the so-called “dativus 
sympatheticus” and its gradual replacement by the possessive adjective from late 
OE onwards (see, e.g. Ahlgren 1946). All in all, however, the syntactic input from 
Latin remained most likely less important than lexical borrowing, which is what 
could be expected on the basis of the superstratal nature of the Latin-OE contacts. 


3 Early Foreign Influences on English Syntax 


This section focuses on three syntactic features which are of particular interest 
from a contact-linguistic point of view as they all manifest the divergence of English 
from its Germanic sister languages, especially German. The features to be discussed 
are: the progressive, the it-cleft construction, and certain types of relative clauses. 
As will be seen, none of these can be satisfactorily explained as endogenous devel- 
opments, or at least their emergence as well as syntactic and semantic properties 
have most likely been influenced by contacts with one or another of the other 
languages mentioned above. 


3.1 The English progressive 


The question of the origins of the English progressive, or -ing form,
has intrigued scholars from early on and has given rise to an extensive literature. 
Broadly speaking, one can discern three main positions on the rise of the English
progressive: (1) views which consider the emergence of the progressive mainly
as an endogenous development in English (although possibly reinforced by the 
Latin model); (2) views which emphasize the influence of Latin, Greek, or French 
on the progressive; and (3) views which look to the Celtic (Brythonic) languages 
as the principal source of the progressive. 

Some of the most eminent exponents of the first view are George O. Curme 
(see Curme 1912) and Gerhard Nickel (see Nickel 1966); others subscribing to this 
stand include F. Th. Visser (see Visser 1963-73) and Bruce Mitchell (see Mitchell 
1985). For all of them, the progressive represents an essentially internal develop- 
ment, despite the existence in OE of two distinct constructions that could be con- 
sidered to have provided the basis for the later progressive form — one involving 
the so-called gerund or verbal noun (with the ending -ung/-ing) and the other the 
present participial construction (realized by the ending -ande/-ende). How exactly 
these two constructions have since evolved into the ME and Modern English pro- 
gressives has always proved hard to explain in terms of internal development 
only. Another aspect of the matter that calls for an explanation is the sui generis 
nature of the English progressive as compared with most other Germanic languages 
or dialects. Recent research has shown that various German(ic) dialects have in 
the course of their histories developed periphrastic progressive constructions which 
resemble the English progressive, but with few exceptions (such as the Rhineland 
and some northern dialects of German), these have not been fully grammaticalized
in them. This has led, for example, Poppe (2003) to argue that English is the
only Germanic language in which the periphrastic progressive is based on the 
merger of the formerly distinct participial and prepositional progressives (Poppe 
2003: 75-6). Furthermore, there is a formal difference between the English pro- 
gressive and its putative parallels: the latter are formed with the nominalized 
infinitive (as in Er ist am Lesen lit. ‘he is at-the read’) and not with the verbal noun 
type structure as in English. Whether this suffices to exclude the possibility of 
some degree of convergent developments between the Germanic dialects, includ- 
ing English, remains arguable. One possibility, suggested by Fischer and van der 
Wurff, looks to more than one cause to explain the development of the English 
progressive. These include the early loss of inflections in English (as compared 
with Dutch or German), which in turn contributed to the grammaticalization of 
periphrastic constructions to replace the inflections, as well as the merger in ME 
between the verbal noun in -ung/-ing and the present participle in -ende, to even- 
tually yield ME -ing (Fischer & van der Wurff 2006: 136-7). 

Of the possible external influences on the English progressive, some Latin 
parallels are perhaps the most often cited. For example, Mossé (1938) claims that, 
although the potential for the development of the progressive form was already 
present in OE itself, its rise in OE was triggered by direct influence from Latin 
and Greek. According to Mossé, Latin influence on OE was mainly transmitted 
through the practice of providing interlinear glosses or translations of Latin 
constructions which had no structural parallels in OE. These included, among 
others, the verb ESSE followed by the present (or the past) participle, as in erat 
docens ‘was teaching’ (Mossé 1938: 156; see also Nickel 1966: 268), which could then
be rendered into English by the OE auxiliaries beon/wesan ‘be’ followed by the
present participle. The relative infrequency of the progressive in OE poetical texts 
is according to Mossé a factor which supports Latin influence. Another well-known 
advocate of external influence and of the Latin hypothesis, in particular, is 
Jespersen (1909-49: vol. 4, §§12.1(2)-12.1(3)). 

Whereas the Latin hypothesis can, indeed, be supported by the kind of evidence 
mentioned by Mossé and others, the transition from the two available OE constructions
to one in ME remains, once again, a problem for this account. Mossé
(like some others) favors (more or less) direct continuity between the OE be + 
present participle construction (see, e.g. Curme 1912; Mossé 1938; Mitchell 1985; 
Nickel 1966) and assumes a rather complex and unpersuasive chain of phonetic 
changes to explain the transition from the suffix -inde/-ande to -ing in ME and later 
(1938: 113). The Latin hypothesis is beset with other problems, too. Thus, Nickel 
(1966), who provides a detailed analysis of the typical contexts in which the 
progressive occurred in OE texts, is able to show that even in those texts which 
are clearly affected by Latinate forms, the use of the progressive does not con- 
sistently follow the model of the corresponding Latin forms. On these grounds, 
he concludes that the extent of Latin influence on the English progressive has been 
greatly exaggerated. 

A variant of the classical language hypothesis is the “Romance language 
hypothesis,” the main exponent of which is Einenkel (1914). According to him, 
the French gerundial-participle construction involving the suffix -ant provided the 
crucial stimulus for the English gerund and the later progressive construction. 
Einenkel’s account was specifically aimed at refuting Curme’s position, which, 
as was noted above, looked to purely native origins for the English gerund. 
However, the major weakness of Einenkel’s argument is, as Dal (1952: 31) points 
out, that it fails to explain how and why the English progressive eventually came 
to be based on the -ing form and not on the OE present participial forms ending 
in -ende/-ande, which would have been the expected development, given the 
formal similarity with the French form. This is not the case, however, as it is 
precisely in the ME period that the old participial endings are replaced by the 
ending -ing. Visser (1963-73), who sees the progressive as a primarily endoge- 
nous development, suggests “selective” influence from French: the French model 
is according to him particularly relevant for those ME -ing constructions which 
were preceded by the preposition in, emulating thus the French pattern en 
chantant ‘in/while singing’ (Visser 1963-73: §1859). 

Finally, we turn to the possibility of Celtic influence, and indeed, there are 
several factors that speak for Celtic contact influence on the English progressive. 
First of all, the Celtic languages (especially Brythonic) have from early on had 
close parallels to the English progressive in the form of constructions involving 
the verb ‘be’ followed by the so-called verbal noun (see Mittendorf & Poppe 2000; 
Filppula 2003). In fact, they are structurally closer than, for example, some of 
the Germanic constructions that have sometimes been adduced to show that the 
progressive is of purely Germanic vintage. As noted above, the latter are formed 
with the nominalized infinitive and not with the verbal noun type structure as in 
English and the Celtic languages (Vennemann 2001: 356; Filppula et al. 2008: 69).
Secondly, the chronological precedence of the Celtic constructions is beyond any 
reasonable doubt, which is of course a prerequisite for contact effects from this 
direction (Mittendorf & Poppe 2000). Thirdly, apart from the shared structural 
features there are significant similarities in the semantic and functional proper- 
ties of the English progressive and its Celtic parallels, with these properties 
centering around the notion of imperfectivity (Mittendorf & Poppe 2000; Poppe 
2002). A fourth piece of indirect but significant evidence is that the progressive 
is more frequent and has a wider range of uses in so-called “Celtic Englishes” 
than in other present-day British Isles Englishes, especially with stative verbs, after 
auxiliaries (esp. will/’ll, would/’d), and in expressing habitual meaning (Mossé
1938; Braaten 1967; Filppula 2001; 2003; Paulasto 2006). This demonstrates the 
susceptibility of this area of grammar to contact influences. 


3.2 The cleft construction 


Compared with the progressive, rather little has been written on the possible 
contact background of the English cleft construction (CC). Yet there can be no 
doubt that the CC, both in its so-called it-cleft and pseudo-cleft forms, is an
innovation in OE, and again, a feature which marks a clear difference between 
English and German, and from a wider cross-linguistic perspective, between English 
and so-called Standard Average European, as defined, e.g. in Haspelmath (1998). 

To begin with the earliest history of the CC in English, detailed research by 
especially Ball (1991) shows that the frequencies of the CC remained very small 
in the OE period. She also notes that it is hard to find instances in OE texts that 
would match exactly the Modern English it-cleft with a “specificational” reading 
and the “dummy” subject it. Indeed, Mitchell (1985: §1486) states that there are 
no such examples in OE. According to him, the same effect is achieved in OE by 
simply putting the element to be emphasized in initial position without clefting. 
Mitchell does, however, acknowledge Visser’s (1963-73: §63) examples with þæt
as anticipatory pronoun in cleft-type constructions, but evidently does not con- 
sider them proper instances of it-clefts. Visser himself (1963-73), whose paradig- 
matic example here is It is father who did it, lists examples of cleft constructions 
such as (1) from OE onward, noting that in OE introductory hit is sometimes omitted
or, in some cases, replaced by þæt (1963-73: §63).


(1) þæt wæs on þone monandæg . . . þæt Godwine mid his scipum to suðgeweorce
becom (Anglo-Saxon Chronicle, year 1052; cited in Visser 1963-73, §63: 49).
‘That/it was on that Monday ... that Godwin came with his ships to the 
south fortress.’ 


Visser finds, however, more examples of clefts in ME and later texts, which indi- 
cates the gradually increasing use of the cleft construction over the centuries. In 
some contrast with these findings, my own research (see Filppula 2009) based on 
The York-Toronto-Helsinki Corpus of Old English Prose and The Penn-Helsinki Parsed 
Corpus of Middle English (second edition) revealed that, though rather rare and
sometimes open to different interpretations, clefting had a greater presence in the 
grammar of OE than has been assumed in many previous works. The OE corpus 
also contained several instances, such as those in (2) and (3), which were both 
functionally and structurally similar to Modern English clefts. In ME, clefts 
continue to diversify both syntactically and functionally, as already shown by Visser 
and others and also documented in Ball (1991) and Filppula (2009). 


(2) [...] hit wære Swyðun se ðe hine lærde mid þære halgan lare and þone ðe he
geseah on ðære cyrcan swa fægerne (Ælfric’s Lives of Saints (ed. Skeat 1966):
388.4472)

‘It was Swithun who had provided him with the holy teaching and whom 
he had seen so beautiful in the church.’ 


(3) Þa cwæð Iohannes to Petre. þæt hit wære se hælend þe on ðam strande stod;
(Ælfric’s Catholic Homilies (ed. Godden 1979) II, 17:164.118.3653)
‘Then said John to Peter that it was the Savior who stood on that strand;’ 


To turn next to the possible role of cross-linguistic influences on the rise of 
clefting in English, one could start off with Wagner’s (1959) account of the areal 
dimension of the cleft construction. He observes a “geolinguistic connection” (ein 
sprach-geographischer Zusammenhang) between the French mise en relief construc- 
tion (c’est-clefting) and its Insular Celtic parallels. In Celtic languages and in French, 
as he continues, the mise en relief construction is firmly embedded in the gram- 
matical system and is closely connected with other systems, most notably those 
for forming questions in both of these (groups of) languages (Wagner 1959: 
173ff.). Although Wagner does not comment on the rise of the English cleft 
construction in this connection, he draws attention to the frequent use of clefts 
in Hiberno-English (i.e. Irish English), which according to him depends on the
corresponding Irish usage. This tendency is also noted by Visser (1963-73: §64) 
and has been confirmed in more recent studies, such as Hickey (1999) and 
Filppula (1999). 

Ahlqvist (1977) is another scholar to take up the idea of an areal connection 
between those western European languages that have grammaticalized the CC 
(the Celtic languages, English, and French). He suggests that the CC in these 
languages may ultimately derive from Celtic where it was attested earlier than 
in any other of these languages. More recently, Wehr (2001) discusses evidence 
that points to the existence of a “westlich-atlantischer Sprachbund,” involving the 
Celtic languages, French, and Portuguese as its “extreme types,” while English 
represents a more “moderate” type. A central diagnostic feature of this Sprachbund 
according to Wehr is what she describes as the “weakening of the individual word,” 
a tendency which involves several phonological processes such as sandhi phe- 
nomena, enchainement, liaison, elision, and fusion. It is this loss of autonomy of 
the individual word which then explains the prominent status of clefting in these 
languages. English is an interesting “halfway house” in this respect: since it pre- 
serves the possibility of “word accent,” it can use prosodic means for emphasis, 
yet it has developed the CC, unlike German. This fact alone leads one to consider
the possibility that the rise of the CC in English is due to external influences. 

Indeed, German (2003) suggests that the English CC has “a possible French 
adstratal or even a Celtic substratal origin.” He favors the Celtic hypothesis because, 
first, the English CC bears a closer resemblance to the corresponding Brythonic 
construction than the French one. Secondly, just as in Irish English and some other 
“Celtic Englishes,” the CC is a striking feature in Breton French, marking it off 
from standard French. French influence is also discussed but dismissed by 
Ball (1991) because of the earlier attestation of pronoun-foci clefts in English. 
Ball also questions any major Latin influence on the (Old) English CC because of 
the rarity of Latin-derived patterns in OE; furthermore, she notes that OE authors 
sometimes seem to avoid them in their translations of Latin texts (see Ball 1991: 
52 for discussion and examples). 

The Celtic hypothesis thus seems to offer the most cogent explanation for the 
rise of the CC in English. The main factors speaking for Celtic contact influence 
on English clefts can be summarized as follows: 


1 Cleft constructions are attested in English significantly later than in the Celtic 
languages. 

2  Clefting is robust in even the earliest stages of the Celtic languages, probably 
going back to continental Celtic (Gaulish). 

3 English and Celtic share other syntactic features that separate them both from 
languages such as German which are in the nucleus of Standard Average 
European. These features include so-called “internal possessors” and identical 
forms for intensifiers and reflexive pronouns. 

4 Cleft constructions are both more frequently used and syntactically more 
versatile in present-day (and earlier) Celtic-influenced varieties of English than 
in other British Isles Englishes, including Standard English. 


However, strong as the case for direct Celtic influence appears to be, the avail- 
able evidence makes it hard to rule out some degree of mutually reinforcing, 
adstratal, influences especially between the late medieval and later stages of English 
and French. This, if vindicated by further research, would be in line with 
Wagner’s aforementioned idea of a “geolinguistic connection” between English 
and the westernmost continental languages (Wagner 1959: 173ff.). 


3.3 Relative clause structures 


This section deals with certain types of relative clause structures, the origins of 
which have been the subject of many debates both in the earlier and more 
recent literature. The first of these is the so-called “zero relative” construction or 
“contact-clause,” as it is also often called. Closely associated with zero relatives 
are two other, partially overlapping, phenomena, namely “resumptive pro- 
nouns” and “preposition stranding.” Resumptive pronouns are pronominal and 
anaphoric reflexes of the antecedent in the subordinate clause. They are not part 
of present-day Standard English syntax, but examples occur in some regional
dialects, e.g. That’s the chap that his uncle was drowned, recorded from Welsh 
English (Parry 1979: 146). Preposition stranding, then, occurs in prepositional 
relative clauses where the preposition can be left “hanging” or “stranded” at the 
end of the relative clause, e.g. the rock we sat down on (cited in Isaac 2003a: 47). 
As in this example, the relative element is often suppressed, especially in speech. 

The contact-clause is a well-known feature of OE. Thus, Visser (1963-73: §18) 
speaks of “apo koinou” constructions, his paradigmatic example being I have an 
uncle is a myghty erle. According to him, they occurred here and there in OE texts 
but were “not frequent.” In ME and Early Modern English, by contrast, their 
frequency is “considerable,” but in later Modern English the contact-clause 
becomes “archaic” and “dies out” in present-day English (1963-73: §18), except 
in present-day dialectal English, especially Anglo-Irish (§21). Visser’s account 
is confirmed by Traugott (1992), who also considers the contact-clause to be 
“relatively rare” in OE. She takes it to be a native construction because of its 
appearance in the earliest poetry and even in translations of Latin texts where 
the original has an overt relativizer (Traugott 1992: 228). She illustrates the latter 
type of context with the following OE example: 


(4) and sægdon him ða uundra dyde se hælend
and told them those wonders did that Savior
[Lat. ‘et dixerunt eis quae fecit iesus’] 
(Traugott 1992: 228) 


Traugott further notes that the contact-clause is “usually found in relative clauses 
with predicates such as hatan ‘to call, name’, wesan ‘to be’, belifan ‘to remain’, nyllan 
‘to not want’, verbs that are either stative or are used statively in the construc- 
tions under discussion” (1992: 228). What are also of particular interest in this 
connection are her observations on the occasional use of resumptive pronouns in 
OE writing (1992: 229). These are seemingly repetitious pronominal reflexes of 
the antecedent in the subordinate relative clause and occur almost always with 
the relativizer þe, but also occasionally with þæt. The relativized NP can be in an
accusative, dative, or genitive form, as in (5), (6), and (7), respectively: 


(5) and ic gehwam wille þærto tæcan þe hiene (ACC)
and I whomever shall thereto direct PT him
his lyst ma to witanne
of-it would-please more to know
‘and I shall direct anyone to it who would like to know more about it’
(extract from Orosius, cited in Traugott 1992: 229)

(6) Swa bið eac þam treowum þe him (DAT) gecynde
So is also to-those trees PT to-them natural
biþ up heah to standanne
is up high to stand
‘so it is also with trees to which it is natural to stand up straight’
(extract from Boethius; cited in Traugott 1992: 229)

(7) Se wæs Karles sunu þe Æþelwulf West Seaxna cyning
That was Charles’ son PT Æthelwulf West Saxons’ king
his dohtor hæfde him to cuene
his daughter had for-himself as queen
‘he was the son of Charles whose daughter was the queen of Æthelwulf,
King of the West Saxons’
(Anglo-Saxon Chronicle, cited in Traugott 1992: 206)


As mentioned above, both the contact-clause and resumptive pronouns continue 
to be used in ME. An interesting feature of the ME relative clauses is the more 
frequent deletion of subject-relatives as compared with object-relatives in ME texts. 
According to Mustanoja (1960: 205), this indicates a later development of the 
latter type of deletion phenomenon. He also pays attention to the more common 
use of the contact-clause in poetry than in prose (1960: 205). Fischer (1992: 306) 
confirms Mustanoja’s observations on the commonness of zero relatives in subject 
position in both early and late ME texts. 

It was not until the wh-pronouns, which were capable of indicating case, 
had developed that resumptive pronouns gradually disappeared from standard 
language. As Fischer puts it, they were no longer needed to fill a “systemic gap” 
(1992: 309).

In contrast with Traugott’s (1992) and Fischer’s (1992) position, other than endo- 
genous sources have also been put forward in the literature. For example, Hamp 
(1975) suggests that the development of the English contact-clause “may be put in 
strikingly direct relation with certain configurations of Medieval Welsh surface 
structure” (Hamp 1975: 299). He refers to Bever and Langendoen (1972), who accord- 
ing to Hamp “cannot explain why German, unlike English, cannot delete Rel; nor 
why OHG and OSaxon could” (Hamp 1975: 299). He himself explains deletions 
in the latter two by “rules inherited from Germanic grammar,” but the later English 
deletions by Welsh influence. 

Contact influence from Welsh is supported by the fact that in both earlier and 
present-day Welsh, deletion of the relative pronoun is very common. As Evans 
(1964/1989: 60) states, a relative pronoun is present only in affirmative clauses 
where it functions as subject or as object of the relative clause. Even in these, the 
pronoun may be omitted before oed ‘was’, as in e guvyr oed en e grogi ‘the men 
[who] were hanging him’ (1964/1989: 61). In negative clauses, no form of the 
relative pronoun is used, and the same holds for a number of other contexts, such 
as before compound verbs containing certain prefixes or where the verb is preceded
by certain negative or preverbal particles (1964/1989: 61-3). In the Welsh
grammatical tradition, these clauses are called “proper relative clauses,” as 
opposed to “improper relative clauses,” which express a genitival or an adver- 
bial relationship (either with or without a preposition), or in which the relative 
element is a nominal predicate (Evans 1964/1989: 60, 64). The improper relative 
clauses have no relative pronoun, but the verb is preceded by the particles yt, 
y(d), ry/yr (affirmative), ny(t), na(t) (negative). As Evans points out, these particles 
gradually came to be felt as “relative conjunctions” (1964/1989: 64). Improper
relative clauses involve what we have above labeled as resumptive pronouns. 
In a genitival clause, this is a possessive pronoun, as in Evans’ example y brenhin
y kigle f... y glot a’e volyant ‘the king whose fame and renown I have heard of’ 
(lit. ‘...I have heard of his fame and renown’). In adverbial prepositional 
clauses, the resumptive pronoun or element consists of a conjugated form of 
the preposition, as in y coedyd y foassant vdunt ‘the woods to which they fled’ 
(lit. ‘... they fled to them’) (see Evans 1964/1989: 65-7).

A further feature of Welsh relative clauses is the phonological process of lenition, 
which is shared by the relative clauses of the other Brythonic languages, as 
Thurneysen (1946/1975: 323) points out. Lenition occurs where the antecedent is 
the subject or object of the relative clause and where the verb of the relative clause 
is preceded by a leniting particle a (Thurneysen 1946/1975: 323). In Old Irish, 
lenition is obligatory in subject relative clauses but optional in object clauses 
(1946/1975: 314). What makes lenition important in this connection is the fact that 
it is, as Hamp (1975: 300) points out, closely associated with the deletion of the 
relative element or particle. This has become a prominent feature of what Hamp 
calls Welsh “object syntax,” leading to a close phonetic similarity between the lenited 
object noun and the lenited verb with the deleted relative particle a. To a bilingual
Welsh–English speaker, “suppression of an overt Rel segment had a strong
linkage with non-subject syntax,” thus explaining why Rel deletion is a regular 
feature of object relative clauses (Hamp 1975: 300). 

Even a brief description of the properties of the Celtic contact-clauses makes it 
evident that the relative clause systems of the Celtic languages, especially those 
of Welsh, could have provided the model for the English contact-clauses. Indeed, 
zero relatives have been argued to have a Celtic background, e.g. by Preusler (1956: 
337-8). Relying on Jespersen (1909-49: vol. 3, §7.1.2) and Kellner (1892/1905: §111), 
he notes the rapid increase in the use of zero relatives in English from the thir- 
teenth century onward. However, contrary to the position adopted by Jespersen, 
Preusler rejects the possibility of Scandinavian influence, because the same 
developments take place in Scandinavian languages at about the same time as in 
English and, hence, too late to have triggered the same process in English. What 
according to Preusler suggests Celtic influence, is the fact that, in Celtic (Welsh), 
relative deletion can occur regardless of whether the antecedent is in the nomi- 
native or accusative. This is the situation in earlier English, too, whereas in 
Modern English nominative relative deletion is much more restricted. 

Preusler’s substratum account has in recent research been taken up by Tristram 
(1999), who discusses it briefly under two headings: “hanging prepositions” (i.e. 
“preposition stranding”) and “zero relatives.” As regards the latter, Tristram is 
content to point out the possibility of relative deletion in both English and Welsh, 
although she notes the difference between the two insofar as deletion of subject 
relatives is concerned. Tristram’s discussion of preposition stranding is equally 
brief: referring to Preusler and to Molyneux (1987), she outlines the systems of 
preposition stranding in English relative that-clauses and compares these with Welsh 
where, in contradistinction to English, the stranded element is a prepositional 
pronoun inflected for gender and number and therefore stressed. Tristram does 
not, however, pursue the matter of possible contact influences beyond these 
observations.

What also indirectly supports Celtic influence on English relatives is the fact that
some present-day dialectal varieties of English make frequent use of resumptive 
pronouns, which are clearly reminiscent of the Celtic (Welsh) “improper relative 
clauses.” Thus, The Survey of Anglo-Welsh Dialects (SAWD) records examples such 
as That’s the chap that his uncle was drowned (questionnaire item IX.9.8) at two loca- 
tions in Dyfed/Cardiganshire (see Parry 1979: 146). Commenting on this usage, 
Parry compares it with Welsh constructions of the type Dyma’r dyn y canodd ei fab 
yn y cor, literally ‘This is the man that his son sang in the choir.’ Similar phenomena
occur in Irish English and also Scottish English, and they too have close parallels 
in the relevant Celtic substrate languages (see for instance Filppula 1999 and Hickey 
2007 on Irish English; and Miller 1993 on Scottish English). 

However, there is so far no consensus about the role of the Celtic languages in 
the development of the English relative clauses. For example, in his discussion 
of the origins of relative clauses with preposition stranding de la Cruz (1973) first 
notes the existence of a “complex of isogloses” (sic!) in the northwest of Europe, 
comprising syntactic features shared by English, Scandinavian, and Celtic. Apart 
from relative clauses with preposition stranding, which are a feature of all three 
groups of languages, these features include infinitival structures with stranding 
prepositions such as Norwegian Jeg hadde ingenting å lete etter i skapet, paralleled 
by the English I had nothing to look for in the cupboard and the Irish Níl leabhar agam 
le caint faoi ‘I have no book to talk about.’ These may be compared with the German 
structures which do not involve stranding prepositions: Ich habe keinen Füller, um 
damit zu schreiben ‘I have no pen to write with’; Dies ist ein gutes Haus, um darin zu 
wohnen ‘this is a nice house to live in’ (de la Cruz 1973: 175-6). De la Cruz then 
goes on to point out that English and Scandinavian, but not Celtic, have developed 
further to allow reinterpretation of the object of the preposition as subject of a 
passive sentence. An example from Swedish is Jag försäkrar Er att jag kan räknas 
på, which has an exact parallel in the English I assure you that I can be relied upon. 
This “passive transformation” is not, however, found in Celtic, which is why de 
la Cruz posits an “Anglo-Scandinavian isogloss” separating Celtic from “Atlantic 
Germanic,” excluding Icelandic, which in his words “has not advanced as far as 
English and the other Scandinavian languages” (1973: 177). De la Cruz concludes 
that since only relative clauses and infinitival structures with stranding prepositions 
but not stranding prepositions in passive constructions have direct counterparts 
with Celtic languages, it is “hard to indulge in the temptation of assigning a Celtic 
origin to the English and Scandinavian structures” (p. 172). Instead, he is content 
to interpret the “Anglo-Scandinavian isogloss” as a “most advanced development 
in the most Western corner of Europe,” that is, as being internally motivated 
(p. 172). De la Cruz’s argument against Celtic influence is, however, untenable as 
Irish does not have a passive similar to English or Scandinavian, but a verb form 
which is not marked for person. It does not therefore make sense to speak of Irish 
not having a “passive transformation.”* 

Another challenge to the Celtic hypothesis and to Preusler’s (1956) account, in 
particular, is presented by Poppe (2005), who provides a detailed discussion of 
both Welsh relative clauses and their Germanic counterparts. He points out the 
role of lenition as a formal marker of subordination in Welsh even in those cases 
in which the relative marker a has been elided. This he takes to be a crucial 
distinguishing feature between the English contact-clause and the Celtic relatives and 
something which in his view would have worked against transfer in a contact 
situation. From a general contact-linguistic point of view, however, it is hard to 
see how a phonological feature like this (although syntactically conditioned) could 
have precluded syntactic contact effects which could be assumed to have been 
more dependent on the overt presence — or absence, as in this case — of the relative 
element itself. Besides, being simplificatory in nature, deletion of the relative 
pronoun would accord well with what usually happens in contact situations. 

A potentially more serious objection raised by Poppe concerns the date of 
emergence of the contact-clause in English and the existence of “pan-Germanic” 
parallels to it. As for the former question, Poppe suggests that “the decisive phase 
of interference [from Welsh] would postdate probably at least 1200,” which he 
takes to mark the approximate dating of the elision of relative markers in medieval 
Welsh itself. However, from the literature on Middle Welsh one can gather that 
relative deletion was by that stage an established feature of at least the spoken 
varieties, if not of the written language, and hence, could well have been present 
even in the earlier stages (see, e.g. the discussion in Evans 1964/1989). Also, the 
“improper” relative clauses with resumptive pronouns probably date back to Old 
Welsh, as Isaac (2003b: 93) writes. 

Turning next to the possible Germanic roots of the contact-clause, Poppe first 
notes, following Ebert (1978), that “asyndetic relative clauses” (which is the term 
used by Ebert) are a rare feature of both the oldest stages of English and of 
Scandinavian languages but become more frequent in their later histories. Poppe 
finds further support for the Germanic origins especially in the work of Dekeyser 
(1986). According to Dekeyser, the subject contact-clause (but not the non-subject 
ones for which Dekeyser proposes a different origin) arose in OE and can be seen 
as ‘an offshoot of a much wider phenomenon inherent to all the “primitive” 
Germanic dialects’, which he describes as ‘the Old Germanic asyndetic parataxis 
without an overt subject’ (Dekeyser 1986: 112-13). Dekeyser further states that 
this feature was later lost in German and Dutch, but was grammaticalized in English 
and the Scandinavian languages. As regards the origin of the non-subject contact- 
clause, which according to Dekeyser was “extremely rare” in OE, his suggestion 
is that it was due to “the introduction of a new relativization strategy with a deletable 
that and fixed word-order” (1986: 109, 115). Dekeyser does not, however, comment 
on the possible factors affecting the rise of this new strategy, apart from stating 
that it coalesces with the introduction of that as a relative marker and also with 
the stabilizing of the SVO word order in Early Modern English (1986: 114). 

Dekeyser’s (1986) (and Poppe’s) account can be queried on the basis of the 
ambiguous nature of the evidence from Old Germanic. It is not at all clear how 
the putative early Germanic relative structures should be interpreted: are they depen- 
dent clauses, and thus “genuine” instances of relative structures, or independent 
clauses, juxtaposed to each other in asyndetic parataxis? If the former is the case, 
it would explain why the contact-clause appears as early as in OE. Yet, what would 
remain unexplained on this account is the gradual increase of this type of relative 
clause in later English, as opposed to German or Dutch, which lose it over time 
— not to mention the extension of the contact-clause to non-subject relatives in 
ME. Even under this scenario, then, English undergoes a clear typological change 
which distances it from its Germanic neighbors, which brings us back to the ques- 
tion of Celtic influence as a factor promoting such change. Needless to say, if the 
alleged Germanic parallels are not to be considered relative structures on a par 
with those found in OE or Celtic, that would enhance the probability of early Celtic 
influence on the English contact-clause. 


4 Conclusion 


Historical linguists have until quite recently been accustomed, and indeed, trained 
to consider contact-induced change in all domains of language except perhaps 
the lexicon as something of a “last resort”; language contacts are accepted as an 
explanatory factor only if explanations in terms of language-internal factors fail 
to yield satisfactory results. This may well be due to the legacy of structuralist 
linguistics, which treats language as a system où tout se tient, and which seeks to 
explain language change by system-internal factors alone (cf. Gerritsen & Stein 
1992: 5-6). There should not, however, be any principled basis for the primacy 
of language-internal factors, and our aim should always be to find the best and 
most plausible explanations, whether internal or external. The present chapter 
has sought to redress the balance in historical-linguistic studies of English by 
examining three syntactic features which cannot be properly explained without 
considering the sociohistorical background of their emergence and development. 
In all three cases this has meant looking further afield to corresponding develop- 
ments in the syntax of the neighboring languages both in the British Isles and on 
the continent. More specifically, this kind of areal and typological approach has 
underlined the distinctive and “un-Germanic” nature of English with respect to 
these features. The flip side of this is the striking resemblance between English 
and the various Celtic languages, which have mostly been disregarded in tradi- 
tional accounts of the history of English. The fact that recent research has brought 
to light several other structural similarities between English and Celtic (see 
Filppula et al. 2008 for a detailed discussion) serves to highlight the central role 
played by language contacts in shaping the grammatical system of English. 


NOTES 


The argumentation presented here stems from the work carried out under the auspices of 
the research project English and Celtic in Contact, which was supported by the Academy 
of Finland (Finnish Academy Project no. 47424). The research was carried out jointly by 
Professor Juhani Klemola, Dr Heli Paulasto, and myself. However, I alone remain respon- 
sible for the exact formulation of the views expressed in this article and for all possible 
shortcomings. 
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1 Note that the term “progressive” is used here as a convenient cover term and is not 
meant to be tied to any single semantic notion. In some other works, the more neutral 
term “expanded form” is preferred (e.g. Nickel 1966). 

2 I owe this point to Raymond Hickey (p.c.). 


22 Contact and the 
Development of 
American English 


JOSEPH C. SALMONS AND 
THOMAS C. PURNELL 


Writing in American Speech in 1929, E. C. Hills took up Meillet’s “theory of 
a linguistic ‘substratum’ that underlies and modifies or has modified certain 
languages” for English as spoken in the United States. He argues against any 
significant role for contact in the development of American English: 


The English poured into what was virtually an empty land. The few nomadic 
Indians were pushed back and sequestered in “reservations.” The Indians have 
given to English a few words, but otherwise they have had no effect whatever on 
the English language spoken in America. They have not created in any respect a 
linguistic substratum. 

In perhaps one fiftieth of the United States there are linguistic substrata. These are 
formed by the French in parts of northern New England and Louisiana, the Spanish 
of the Southwest, the small German colonies in Pennsylvania, and the negroes in 
some districts in the Southeast. Elsewhere English has no substratum in the United 
States. It is true that in certain large cities there are recently arrived colonies of 
people of non-English speech, such, for instance, as the Italians of San Francisco and 
the Poles of Chicago. But in these colonies those of the second generation speak English 
with little or no foreign accent and those of the third generation generally lose the 
foreign speech completely. It could not be otherwise with the extreme mobility of 
our population and the great economic pressure that is put on our immigrants to 
learn English. (Hills 1929: 432) 


If we leave aside the flawed, anachronistic, and offensive views of American demo- 
graphic history in this quote from Hills, his denial of substrate influence in 
language change is part and parcel of a (still) widespread skepticism toward sub- 
strate varieties. Such views are driven in no small part by the abuse of language 
contact as an explanation for change generally and substrates in particular as deus 
ex machina. At its worst, scholars have appealed to substratal explanations where 
there is no evidence of the alleged substrate language (Hock & Joseph 1996: 387; 
Trask 2000: 329). Overwhelmingly, as a result, work on the history of American 
English has tended to bracket out contact as an issue.’ 

Even today, American English can seem striking for its homogeneity at the 
national level. Indeed, even with media attention to the evidence to the contrary 
(e.g. Sultan 2006), most Americans assume that regional variation is modest 
and receding. The present analysis of language contact in the development of 
American English, as will become clear shortly, is undertaken at a local level, often 
meaning both a limited geographical area and some identifiable social group. 
Numerous scholars argue that demarcative differences occur on this level, espe- 
cially in the realm of structural language-contact effects beyond the lexicon. Even 
at the regional level, we will argue here, the effects of contact on many varieties 
of American English may be modest, even subtle, on first glance, but they do exist 
in ways that are interesting and important for understanding American language 
and society today (see also José 2007). 

We discuss possible structural impacts on the regional English of monolingual 
Americans that are likely to have originated in other languages and, to a lesser 
extent, in other English dialects. We first lay out some conceptual and theoretical 
preliminaries (section 1), and touch briefly on structural effects of lexical borrowing 
(section 2). With that, we turn to ethnolects in the US that reflect the linguistic 
heritage of particular groups (section 3). That sets up a more extended case study 
of the Upper Midwest (section 4), where we see features originally associated with 
the most widely spoken immigrant languages establishing themselves as regional 
markers. Finally, we provide a bigger picture with concluding remarks (section 5). 
From its beginnings, we argue, English spoken in the present-day United States 
has been forged by language contact to a greater extent than is widely appreciated. 


1 Background 


One major challenge for the current topic is determining what is and is not “contact- 
induced” change. On the one hand, as already suggested, traditional historical 
linguists have typically preferred “internal” accounts of change unless there is 
compelling evidence that something is contact-induced. On the other, some now 
see essentially all language and dialect change as driven by contact, usually in 
the sense that the spread of change involves diffusion. The former view rests on 
a dichotomy between “internal” and “external” motivations, where structural and 
contact-related changes are rigidly separated. That view appears to be incorrect 
(Rickford 1986; Mufwene & Gilman 1987; Dorian 1993; Thurgood 1996). We aim 
to avoid “the weakness of simplistic dichotomous thinking” (Dorian 1993: 152) 
and work to trace how “internal” and “external” factors interact in change. (The 
nature of the interplay differs depending on the type of contact and varieties in 
contact.) The latter view — where all change arises from contact — risks trivializ- 
ing the notion of “language contact,” rendering it vacuous (see Labov 2001: 20). 
We can avoid this problem by placing analytical importance on both the ultimate 
sources of features and the transmission of features through communities. Both 
play central roles below. 

Turning to terminology, we take Hills’ term substrate as a cover term, referring 
to the residues of language shift, where an adult’s first language (L1) influences 
L2 acquisition (e.g. Mufwene 2001). Some argue that substrate effects are motivated 
not by social identity but by structural accommodation in the L2 whereby 
language learners engage in relexifying portions of their native lexicon through 
transfer of frequent and perceptually salient or congruent patterns (Trudgill 
1986; 2008; Mesthrie 2001). 

Speaker age naturally plays a role in the transmission of features, whether under 
contact or not (Labov 2001: 415-45). Under the apparent-time hypothesis (cf. Labov 
1966), linguistic differences between speakers of various ages collected at one point 
in time reflect different eras of acquisition, making change in progress readily 
observable. Longitudinal studies across speakers’ lifecycle show subtle real-time 
changes even among adults, for example the Canadian [ʍ ~ w] merger (Chambers 
2002: 357) and Montreal French shift of apical > dorsal variants of /r/ (Sankoff 
& Blondeau 2007). We assume that early acquisition follows from input of care- 
takers and peers, with reshaping along emerging lines of social identity later with 
only limited flexibility. Substrate influence, thus, is most likely to occur with adult 
acquisition of the new language or dialect, and a cohesive community involving 
enough adults to help shape the grammars of a generation of L1 learners. 

Three aspects of contact theory are most helpful in understanding how source 
features have been transmitted into American English: imposition (ultimately, 
source features showing up in the developing variety, but see below for a fuller 
description), the process of koineization (variety formation via leveling and real- 
location), and timing (a post-immigration delay allowing for leveling stabilization). 


1.1 Imposition 


A more nuanced alternative to “substrate” is “imposition” (van Coetsem 1988; 
2000; Howell 1993; Winford 2005), arising in situations where a group brings L1 
features into their L2 repertoire, features which are, in turn, adopted by native 
speakers of the L2 as part of the broader speech pattern. Much structural inter- 
ference can be characterized as “imposition” resulting from this imperfect L2 acqui- 
sition.’ As van Coetsem puts it, “the source language speaker is the agent, as in 
the case of a French speaker using his French articulatory habits while speaking 
English” (1988: 3, also in Winford 2005: 376). This focus on “agentivity” — where 
borrowing is “recipient language agentivity” and imposition “source language 
agentivity” — helps constrain notions of which kind of linguistic item is more or 
less likely to be borrowed or to be imposed during contact. Howell (1993: 189) 
represents the inverse relation between borrowing and imposition like this: 


(1) Stability: Borrowing versus imposition 

More open to borrowing          >   Less open to borrowing 
Less affected by imposition     <   More affected by imposition 

Less stable domains:                More stable domains: 
lexical items, derivational         phonology, inflectional morphology, 
morphology                          semantic system, syntax 


1.2 Koineization 


While van Coetsem’s model predicts structural types of contact effects, recent work 
on koineization in closely related dialects helps illuminate developments under 
contact. Koineization, or new dialect creation, proceeds from dialect contact 
through leveling and simplification (Trudgill 1986; 2004; Britain & Trudgill 1999; 
Kerswill 2002; Kerswill & Trudgill 2005; Hickey 2003). In dialect contact, some 
original features may persist. A Founder Principle applies to mixing situations where 
dialect features persisting generally come from the first speech communities 
contributing to the koine (Mufwene 1996). For example, Kwa-speaking slaves, who 
were brought to the west early, shaped plantation African-American English (AAE) 
much more than more numerous Bantu speakers who arrived later (Mufwene 
2001). Generally, dialect features with psychological significance (stereotypical 
or stigmatized, Dillard 1972; Hickey 2000; Kerswill & Trudgill 2005) tend to be 
smoothed over and leveled. This phase thus militates against substrate effects. 
Consequently, koineization theory claims that the only substrate effects persist- 
ing into a new koine serving as an incipient standard (in contrast to a vernacular) 
will be those without psychological baggage. If it is true that this process works 
only when the koine serves as a supraregional variety (Hickey, p.c.), then we are 
unsurprised at substrate effects being incorporated into AAE or vernacular AAE 
features (e.g. from sources such as hip-hop) into general American English. 

This approach does allow for substrate effects; once the koine is stabilized via 
the smoothing out of cross-dialect variation, some variables emerge with new pur- 
pose (“focusing” in Kerswill & Trudgill 2005). Thus an effect emerges over the 
course of two post-contact generations. Kerswill and Trudgill (2005: 200; schema- 
tized in (2) below, from Trudgill 1999, elsewhere) argue that it is important for 
migration to have stabilized by that point as well. This waiting period allowing 
for stabilization of the koine occurs with colonialization, hence Trudgill’s term 
Colonial Lag. Below (section 4.2), we apply this to another apparent imposition, 
final fortition in the Upper Midwest. 


(2) The path of koine formation 


Stage   Speakers involved                    Linguistic characteristics 
I       adult migrants (first generation)    rudimentary leveling 
II      first native-born speakers           extreme variability and 
        (second generation)                  further leveling 
III     subsequent generations               focusing, leveling, and 
                                             reallocation 


As new varieties form, whether through contact among a variety of very differ- 
ent languages or a few closely related dialects, the social significance we assign 
to particular features changes, sometimes dramatically. Consider an example of 
dialect contact in American English, rhoticity.* Rhoticity is a cover term for the 
consonantal articulation of coda /ɹ/ (r-fullness) or non-articulation of /ɹ/ in 
codas (r-lessness). Although coda-final /ɹ/ was likely in recession (see Hickey 2010), 
early colonial settlement on the Atlantic coast brought speakers of both r-ful and 
r-less dialects — typically r-less varieties in areas settled from southern England, 
typically r-ful in those settled by northern English and Scots-Irish, for instance. 
R-lessness was regarded as “prestigious” (Kurath & McDavid 1961), and spread 
inland in particular directions from urban centers. For example, r-lessness spread 
northward from Boston into Maine but not westward across Massachusetts or into 
upstate New York (Bloch 1939), and also spread westward from the Tidewater 
region in South Carolina from Charleston through the plantation region toward the 
Blue Ridge Mountains (McDavid 1948). Even assuming the original distribution 
reflected settler dialects, explanation of subsequent changes — like the diffusion 
of r-lessness into once r-ful geography from eastern Massachusetts north to 
Maine and inland from Charleston, South Carolina — requires focusing on social 
factors like identity and affiliation to the source location, not migration alone. 
We then see patterns of diffusion defined along such parameters. McDavid 
(1948) noted that younger speakers, female speakers, and urban speakers were 
r-less, characteristics now associated with carriers of innovations. 


1.3 Time lag 


We can still observe the changing status of r-fullness and r-lessness. In the South, 
the switch of r-less prestige to r-ful prestige has been well documented, and social 
variation differs by time and place. Compare McDavid’s (1948) picture of South 
Carolina where female speakers are more r-less to Schönweitz’s (2001) survey of 
data from The Linguistic Atlas of the Gulf States (LAGS 1986-92) where female speak- 
ers are more r-ful. In New York City, Labov (1966; 1972) found that innovating 
r-lessness co-varies with social class and argued for a top-down innovation sug- 
gestive of hypercorrection. Finally, variation and change in r-lessness among African- 
Americans confirms the notion that analysis needs to take place locally (Nguyen 
2006). Wolfram’s (1969) Detroit study shows that r-lessness is sensitive to social 
class and gender as well as ethnicity (see also Levine & Crockett 1966; Anshen 
1969; Schönweitz 2001). Wolfram and Thomas (2002) found older AAE speakers 
in North Carolina exhibit a wide range of variability from those who are almost 
exclusively r-less to those who are almost entirely r-ful. Additionally, subtle dif- 
ferences have been found recently in numerous locales. In coastal North Carolina 
(Wolfram & Thomas 2002) younger AAE speakers have increased in r-lessness 
while whites are moving towards r-fullness. Myhill (1988) found that AAE contact 
with white vernaculars leads to r-fullness among AAE speakers, a pattern also found 
in an Appalachian enclave community (Childs & Mallinson 2004) and Northern 
urban areas (Nguyen 2006; Purnell 2009). Moreover, accommodation (when r-less 
speakers use r-ful forms) entails consideration of a number of situational factors 
in a conversation. For example, Baugh (1988) found that r-fullness increased when 
an African-American English speaker was talking to an unfamiliar person of any 
race or a non-black speaker. Downes (1998: 175) rightly argues that “postvocalic 
r is a different sociolinguistic variable” across communities, with its own history 
in each case. In that spirit, our window of analysis must be local. 

We show below that systematic indeterminacies follow similar patterns of 
social and structural change under contact at the local level. By shifting the 
focus from lexical items to imposition in language structure and to processes and 
timing of koineization, we will find the crisscrossing of social and geographic strata 
by variables yielding distinct subtypes of American English. 


2 A Note on Loanwords 


The one effect of contact readily apparent and universally accepted as such is 
lexical. Some of the best-known discussions of contact in American English focus 
almost solely on that, for example, Mencken’s “Loan-words and Non-English 
influences” (1937: 150-63). Aside from scattered remarks on structure, like the 
intonation of Pennsylvania German English, he restricts discussion largely to the 
lexicon, as does Romaine (2001).° 

In and of themselves, loanwords have had marginal impact on American 
English — they have not changed our stress patterns the way Norman French 
did for English and their morphological integration is seldom distinctive, at least 
for American English (but see Cannon 1984 on zero plural marking on nouns 
borrowed from Japanese). Loanwords have contributed few and minor new 
phonotactic patterns, and differential phonological integration can be found. 

Consider first differences depending on whether a given borrowing came in 
by written or oral transmission: Among German borrowings, for instance, danke 
schön has generally appeared in American speech with the last vowel /eː/. This 
reflects the unrounding of German umlauts found in most dialects imported to 
this country, where Standard German schön [ʃøːn] is pronounced [ʃeːn]. This is 
certainly the case in the Pennsylvania German area, but also a natural way of 
importing front rounded vowels into English. In contrast, German über ‘over’ (often spelled 
uber in English), long known in philosophical usage and as a part of loan 
compounds (cf. Übermensch), has become a productive prefix. It is pronounced with 
the back vowel [u] rather than front [y]. Japanese is the source language for American 
skosh [skoːʃ] ‘[a] little bit’, and both the pronunciation and spelling reflect the loss 
of an initial-syllable voiceless high vowel of Japanese rather than the transcrip- 
tion of the Japanese form, sukoshi. Likewise, our name for the rapidly growing 
Japanese vine is kudzu, approximating the native pronunciation rather than the 
Romanization kuzu. 

Second, numerous foreign words and names have been taken into American 
English (and other varieties) with “hyperforeign” pronunciations, as described in 
Janda, Joseph, and Jacobs (1994). They note that some forms retain relatively native 
pronunciations (or have had them reintroduced historically) such as Bach with 
the velar fricative [x] or milieu as [mɪlˈjø]. In many other cases, Americans and 
other English speakers have extended generalizations about particular languages 
to over-adapt loans. For instance, the knowledge that French often does not 
pronounce written final consonants (English ballet [bæˈle]) and its phonological 
counterpart that French prefers open syllables prompts most Americans to produce 
coup de grâce without its final [s], turning a ‘stroke of mercy’ into a French 
‘stroke of grease’. While the voiced palatal fricative [ʒ] is found in some very 
well-established English words, like measure and pleasure, its foreignness is apparently 
still evident, by its regular extension to Bei[ʒ]ing rather than (the more 
Chinese-like) Bei[dʒ]ing, and humorously (together with stress shift) in garbage [ˈgɑɹbɪdʒ] 
> [gɑɹˈbɑʒ], or the chain store Target as [tʰɑɹˈʒe]. 

Still, the widespread scholarly assumption of a lack of influence beyond the 
lexicon does not match some folk perceptions about American English in some 
parts of the country. In the Upper Midwest, as developed below, it is unremarkable 
for members of communities with strong ethnic/immigrant identities to assume 
that their personal speech reflects their heritage, even if they are monolingual English 
speakers. If asked about some distinctive-sounding pronunciation, such as “stop- 
ping” of interdental fricatives or final devoicing in a word like beer[s], speakers 
may matter-of-factly say “oh, that’s just the Polish/German/Norwegian coming 
out in me.” We turn now to such imposition-type structural patterns. 


3 American Varieties Shaped by Bilingualism 


As noted, much attention to language contact in the history of American English 
has been addressed to what are sometimes called “ethnic dialects,” such as 
French influences on the English of southern Louisiana, Pennsylvania German 
influence on the English of southeastern Pennsylvania, and Spanish influences on 
the English of broad areas of the Southwest. This sort of literature has often described 
English spoken by L2 speakers. The traditionally discussed English of “Cajuns” 
or “Pennsylvania Dutchmen” was nonnative English, reflecting direct imposition 
from the relevant L1(s), not the residue of earlier language shift. Well into the 
twentieth century, such communities were often not merely bilingual, but often 
heavily non-English speaking. In the Upper Midwest, communities much smaller 
and less isolated than the Cajuns or Pennsylvania Dutch remained monolingual 
more than 70 years after immigration. For instance, up to a quarter of Wisconsinites 
in some German immigrant communities are reported in the 1910 US Census as 
non-English speakers, many of them second and even third generation in the United 
States (Wilkerson & Salmons 2008).° The nonstandardness of English spoken under 
such circumstances reflects L1 interference; it does not reflect stable patterns of 
American English as a native dialect, but rather typically transitional phenomena, 
so that almost all such features have been thought to recede and disappear as 
communities become proficient in English. As such, these patterns are therefore 
not of primary interest, but they provide a seed-bed from which later features 
may emerge via the processes of koineization laid out in (2) above. For example, 
in relatively homogeneous immigrant communities where large numbers of adult 
speakers learned English as L2 during a single generation, this may provide enough 
impetus to plant features in a new generation. As we will discuss below, estab- 
lished features in a given community may later become ethnically unmarked 
regional features. 

The patterns of distinctiveness and non-distinctiveness of American English as
spoken by English monolinguals in a whole array of broad and internally diverse 
communities — for example, “Native American” or “Indian English” and “Jewish 
English”⁷ — provide insights into roles played by contact in the history of American
English. We will briefly treat both sets of patterns, followed by a word about dialect 
contact in current AAE. We focus on phenomena found among monolingual 
English-speaking members of communities, which typically begin as ethnic or social 
features, due to a shared linguistic heritage. Crucially, such features can generalize 
to become local or regional markers beyond a particular ethnic community. 


3.1 “American Indian English” 


Almost all Native American communities in the United States are today in the 
late stages of shift to English, and the clear majority in almost every community 
is made up of native speakers of English.⁸ In the classic work on American Indian
English, Leap (1993: 281-2) begins his summary with these points: 


1 American Indian English is an aggregate of English varieties, which differ, as 
a group and individually, from standard English (as expressed through the 
language of the metropolis) and from the varieties of English spoken by non- 
Indians in American society. 

2 The distinctive characteristics of these codes derive, in large part, from their 
close association with their speakers’ ancestral language traditions. In many 
cases, rules of grammar and discourse from that tradition provide the basis 
for grammar and discourse in these English codes — even in instances where 
the speakers are not fluent in their ancestral language. 

3 Other components of Indian English grammar and discourse resemble features 
of nonstandard English; usually, however, these features express meanings 
not attested in other nonstandard codes. The similarities in form should not 
overshadow the significance that these features hold in each case. 


American Indians speak and/or spoke hundreds of languages (see Mithun, this 
volume), are in contact with a range of American English dialects, have acquired 
English under widely different circumstances, and so on.⁹ Given this situation, a
central question is whether shared features span tribes and regions. An import- 
ant historical event bears on the answer to this question. In the late nineteenth 
century, the federal government forced many Native children into “boarding 
schools,” where students were made to use English and punished for using their 
native tongues. Leap concludes from a survey of structural features found in board- 
ing school student letters that most reflected L1 interference. The situation led to 
the rise of what Leap calls “codes-under-construction,” as students learned 
English using all available input (1993: 162), but not to the rise of any broader 
unified dialect. 

Two phonological case studies illustrate paths that English is taking in indigen- 
ous communities. First, Rowicka (2005) presents evidence that members of the 
Quinault Indian Nation in Washington State have begun to use [ʔ] for /t/ in ways
shared by other Native American dialects of English. Similar “glottaling” is 
familiar from an array of other varieties, including white American vernaculars, 
so this pattern may fit under Leap’s point (3), above, perhaps with koineization. 
Second, Anderson (1999) compares the diphthongs /ai/ and /oi/ in Cherokee 
English (westernmost North Carolina) with neighboring non-Cherokee dialects. 
Monophthongization of /ai/ to [a:] is one of the most characteristic features of 
Southern US English generally, but the two communities show somewhat different 
patterns between [ai] and [a:] (along with different patterns of monophthongiza- 
tion of /oi/): Cherokee English has [ai] in hiatus or utterance finally, while 
Anglos monophthongize across all environments. Anderson argues that this 
reflects a combination of both Cherokee influence and accommodation to local 
norms. 

Quinault English lacks demonstrable imposition on the emerging variety, while 
the Cherokee example may reflect some transfer, but mediated and mitigated by 
patterns found in local varieties. What both have in common, then, is that the 
current situation shows relatively subtle linguistic differentiation from surround- 
ing Anglo vernacular usage. To our knowledge, nowhere have uniquely Native 
American features of English yet spread into systematic usage among non-Indian 
neighbors or contact populations. 


3.2 “Jewish English” 


A clear contrast to this situation is found in “Jewish English.” These varieties are 
often associated with Yiddish influence, but the designation covers a vast range 
of groups of very varied cultural and linguistic heritage. Many discussions cor- 
relate religious observance roughly with linguistic distinctiveness, from Hasidic 
communities where Yiddish is learned as L1 to those who consider themselves 
“culturally Jewish” but are not observant and whose speech may differ from other 
local varieties only by a few words. For part of this spectrum, Benor (2004, chs. 4, 
5) gives a concise picture of English spoken in Orthodox communities, covering 
lexical and structural features, as well as discourse and pragmatic patterns. 

A number of features strongly associated with Jewish speech have become unre- 
markable (if recognizable) parts of English for non-Jewish Americans. Consider 
these examples from syntax and phonotactics, domains susceptible to imposition. 
First, a stereotype of Jewish speech is the topicalization of indefinites, as in these 
examples (all examples from Feinstein 1980: 15): 


(3) Indefinite topicalization 
Some milk you want? 
A hotel she lives in. 


Feinstein’s questionnaire results indicate that New Yorkers found such sentences 
more acceptable than non-New Yorkers. While Jewish New Yorkers reported the 
highest use, he still considers this a New York feature. Though associated with 
Yiddish, this is no simple transfer. First, other languages spoken in New York 
City allowed similar topicalization, including Germanic ones, and Irish-derived 
communities would have shown a preference for a similar topicalization, via 
clefting. Second, other varieties of American English allow a closely related 
topicalization, namely of definites: 


(4) Definite topicalization 
The book by Bellow I already read. 
Him I like. 


Feinstein argues that “Yiddish helped to extend the domain of an analogous exist- 
ing rule in English” (1980: 22), rather than introducing a basic change. Moreover, 
the extension remains incomplete: Yiddish allows topicalization of any element, 
including nonfinite verbs, while the New York pattern does not. 

Turning to phonotactics, English has historically had word-initial /s/ + consonant
clusters but not /ʃ/ + consonant clusters. Numerous Yiddish loanwords
have become ubiquitous without assimilating to the native sC pattern — shlep
‘to drag’ and shmooze ‘to chat, especially currying favor’ with [ʃ]. The Yiddish
pattern matches patterns common to other immigrant languages, such as
German, where most varieties allow only /ʃ/ + consonant clusters, as found in
common pronunciation of proper names like Schlitz and Schmidts (both family
names and brands of beer). Intriguingly, Durian (2007) reports a change in
progress of /s/ to [ʃ] before consonants, especially sCr- clusters like strong but
any possible connection between the change in progress and language contact is 
too tenuous to entertain. 

All these features have generalized beyond formerly Yiddish-speaking com- 
munities, but the history is far richer than direct transfer or imposition from Yiddish 
(or other Jewish languages) onto L1 English. In the first example, an existing English 
pattern was extended. In both, changes were supported by parallels in other 
languages present in the communities. Other features have spread, including 
shm- reduplication (fancy shmancy), or features supported by multiple linguistic 
sources, like coda [ŋg] for expected [ŋ] in singer or long (also a Slavic pattern).


3.3 “African-American English”


The above examples show clear, direct links between bilingualism and contem- 
porary speech. AAE had equally clear roots in language contact, as speakers of 
many African languages became speakers of English (Wolfram & Thomas 2002: 
12-31). The development of AAE is also consistent with imposition, koineization, 
and timing. Mufwene (2000) argues that AAE results more from the convergence 
of American English and founding African varieties at various times and loca- 
tions before the Reconstruction of the South than directly from an American or 
Caribbean creole. The resultant reallocated forms are neither wholly American 
English (e.g. gon(na), Poplack & Tagliamonte 1996) nor wholly African (e.g. the 
associative plural, Boretzky 1993). 

Today, AAE varieties are more closely tied to local linguistic ecology than to
a hypothetical pan-AAE variety. AAE is in contact with essentially all other 
varieties of American English. Recent literature suggests that AAE speakers are 
accommodating in part to local white vernacular while distinguishing
themselves by nonparticipation in changes sweeping through the rest of the local 
population (Myhill 1988; Bailey 1997; Childs & Mallinson 2004). For example, AAE 
speakers who live in the “Northern Cities” area are not fully participating in the 
vowel restructuring characteristic of urban, white speech (Gordon 2000; Thomas 
2001).¹⁰ Recent research in the Upper Midwest has begun to explore the role of
contact (e.g. Purnell 2009) and work to date suggests that these groups are
participating in limited ways, while either retaining pan-AAE features or adopting
local white features.

Since the late 1980s, controversy has flared about the relationship of AAE to 
general American English, with some arguing that AAE was/is diverging from 
white vernaculars and others arguing for convergence. This controversy is today 
perhaps best understood at the local level because the psychological value of 
reallocated features is as local as the contact between speakers within a single 
community (Bailey & Maynor 1989; Rickford 1999). We conclude that the shifting 
salience of features, along with the leveling of features that appear to have lost 
significance under contact, reinforces linguistic differences across communities.

The generalization across all three settings mentioned above is that speakers
are navigating a complex cluster of structural and social options in the ways they 
speak. Much of what we see is imposition, save for its absence in Quinault English. 
Borrowing is also present in these communities, as when English L1 speakers in
Native American communities acquire elements of their ancestral indigenous
languages, for instance greetings and leave-takings, borrowings that presumably
began as emblematic code-switching. Another instance would
be where some Yiddish or Hebrew is acquired in the case of “Jewish English” 
(see Benor 2004 on sociolinguistic complexities of this process). These, then, help 
shape the discourse and pragmatic systems of the variety. 


4 Case Study: From Immigrant Languages to 
Regional Varieties 


We now shift focus to an area where features that a couple of generations ago 
were directly identified with immigrants’ nonnative English have now become 
regional, and are widely used beyond members of the particular ethnic commu- 
nities that introduced the features into the mix. 

The potential role of influence from immigrant languages has been noted in 
the recent literature, notably by Mufwene (2006: 178, emphasis ours): 


...Nationalistic parochialism among European colonists, which lasted until the 
early twentieth century, must have reduced the extent of influence that continental 
Europeans could have exerted on structures of North American Englishes. Although
the overall demographic proportion of these immigrants exceeded that of the
Britons in America by the nineteenth century, the incremental pattern of their
growth, including the later immigration of some groups and their gradual absorption
into the prevailing Anglo socio-economic structure, weeded out much of the
substrate influence they could have exerted on their present English vernacular.


Figure 22.1 German nativity in Milwaukee as reported on the national decennial
census, 1870-1950 (Immigration numbers were extracted from US Census data,
accessed through the University of Virginia Library,
http://fisher.lib.virginia.edu/collections/stats/histcensus)


This section sketches some patterns that have resisted this weeding out process 
and the reasons why they might have. We draw here on the Upper Midwest,¹¹
where the demographic history of immigration is relatively well understood and 
the emergence of salient regionalisms is relatively recent and well documented. 

Southern and eastern Wisconsin received repeated surges of immigrants. The 
largest immigrant group in the region, Germans in eastern Wisconsin, provides
a handy focus for this discussion. Consider census reports of German nativity in 
the largest city, Milwaukee, between 1870 and 1950, in Figure 22.1, where nativity 
increases from 1870 to 1910, then falls. While each community’s history is unique, 
the chronology is typically similar across the region, as Radzilowski (2007: 217) 
confirms: “Between 1870 and 1900 the settler immigration that helped fill up the 
rural areas of the Midwest gradually tapered off.” Let us correlate this timing to 
the source of features and timing of the emergence of reallocated features of the 
contemporary dialect of the region. 

We sketch four patterns associated with the Upper Midwest, each with a distinct
relationship to contact: the “stopping” of interdental fricatives (section 4.1),
final fortition (or “devoicing”) (section 4.2), the use of with as a verbal particle 
with verbs of motion (section 4.3), and innovations in the use of other adverbs 
(section 4.4). 


4.1 Interdental fricative stopping 


When asked what makes for a regional accent in various parts of the Upper Midwest 
— including Michigan’s Upper Peninsula, Wisconsin, and Minnesota — one of the 
most common responses from laypeople is “dem,” “dere,” “dose,” the use of [d] 
or [dð] for interdental fricative /ð/, along with [t] or [tθ] for /θ/, as in upnort
‘up north’, a stereotypical reference to Wisconsin’s Northwoods. Among older 
speakers in one heavily German-ethnic community in eastern Wisconsin, Rose (2006) 
sees this feature, which she calls “stopping,” as key to “performing Germanness” 
among her speakers, who included German-English bilinguals and English 
monolinguals.¹² Rose finds that speakers of English or Irish heritage are equally
likely to use stops for fricatives. In particular the use among Anglo-Americans 
indicates that the feature no longer only conveys ethnicity, but has been partially 
reinterpreted as social variation. Among other things, the feature varies by
gender, being more widely used by men than women, and by social context, with
high rates among those playing skat (a German card game popular in Wisconsin)
and bingo, but low among bridge players, for example.

While Rose studied a community with heavy German (and Dutch) roots, stop- 
ping is likewise a central part of performing other ethnicities. It was common in 
many languages brought to Wisconsin and some English dialects, including Irish 
English (Hickey 2004). In koine formation, features present in multiple input vari- 
eties, thus not readily identified with one particular group, tend to survive the 
leveling process into the new variety (Kerswill & Trudgill 2005). The multiple 
origins of stopping promoted its propagation in the state among even those of 
English heritage. 

From a traditional historical linguistic perspective, it would be easy to deny 
that the present-day stopping in the Upper Midwest was directly triggered by immi- 
grant language interference, given that varieties of English spoken around the globe 
have lost interdental fricatives, sometimes where appeal to language contact is 
implausible at best, like southeastern England. Moreover, language-internal accounts 
of this change abound, notably perceptual and articulatory accounts. All these fac- 
tors favor stopping. At the same time, the feature is stigmatized in some settings, 
and may be receding in some areas, though it is certainly used among young speak- 
ers. As Blevins (2006: 20) concludes generally about these segments in English: 


In an imaginary natural history of English, untainted by literacy, prescriptive norms, 
social conventions, and language contact, the loss of /ð/, /θ/ ... would likely
be complete in all varieties of Modern English... However, external factors have 
intervened — among others, the infiltration of American Broadcasting English to ever 
more remote corners of the Earth. As a consequence, these phonemes and the 
contrasts they take part in are hanging onto life... . 


In the Upper Midwest, speakers apparently transformed stopping, once a sign of
L1 interference, into a widespread but socially stratified feature found throughout 
a broader region among speakers of varying ethnic background. Again, we do 
not see direct transfer of an immigrant-language feature onto the regional dialects. 
Instead, as in section 3, a socially and structurally more complex story is 
required: As noted, features found in the speech of many different groups — so 
not identified with any single group — preferentially find their way into the koine. 
As with all features, psychological significance changes during reallocation. 
Stopping has become disconnected to some extent, as Rose shows, from ethnicity, 
being found now among Anglos, but has taken on new social meaning. 


4.2 Final fortition 


Another “reallocated” feature integral to the koineization of Upper Midwestern 
English is final fortition, or neutralization of the “voicing” contrast in word-final 
position.¹³ American English contrasts words like his and hiss or bat and bad, for
example, while in German Bad and bat are pronounced [ba:t], Rad and Rat as [ʁa:t].
While devoicing in word-final position has been attributed directly to German 
influence, fortition or devoicing can also be found in Polish, Dutch, partially in 
Scandinavian, and in some dialects of Yiddish. The following are stereotypical 
pronunciations. 


(5) Stereotypical final fortition in the Upper Midwest 
Da Bear[s]! (Saturday Night Live television skit, Chicago) 
I’m going to wash my hair[s]. (Milwaukee) 


The fact that this feature was used as a stereotype of regional speech on national 
television suggests a perception of devoicing as being connected to the Upper 
Midwest. Those skits traded, as well, in stereotypes of specifically Polish-American 
speech, while the Milwaukee example portrays German-American speech.¹⁴
Phonetically, “voicing” is not only evident in vocal fold vibration during the coda 
obstruent, but also in the length of vowels preceding the consonant (e.g. Heffner 
1937; Parker 1974). These two features — pulsing and vowel duration — can be con- 
sidered as overall measures of the change in voicing over successive generations 
of speakers from southeastern Wisconsin (see Figure 22.2 below, from Purnell et 
al. 2005b; see also Purnell et al. 2005a). This figure depicts changes across four 
sets of speakers: English speakers with birthdates between 1866 and 1892, 1899 
and 1918, 1920 and 1939, and 1966 and 1986. The top point represents the aver- 
age location for /d/ for that group and the bottom point is the average location 
for /t/. What is important here is how the speakers differ from the canonical stan- 
dard relation between /t/ and /d/. We expect that the /t/ points would be in 
the lower left corner and the /d/ in the opposite corner, the upper right corner. 
Those speakers born before the peak in German nativity (in Figure 22.2) use 
glottal pulsing for voicing and largely ignore vowel duration. The middle 
chronological group is most like the generally assumed English pattern. They were 
born after German immigration had waned. The youngest group trades off
between vowel duration and pulsing, but it is clear that their /d/ is more /t/-like.
These speakers are two generations removed from the middle group born
in the German nativity decline. Once again, a pattern that survives into contemporary
vernacular represents a feature present in a variety of immigrant source
languages. We interpret this as final fortition’s reallocation.


Figure 22.2 The historical development of final fortition in Wisconsin (from Purnell
et al. 2005b: 331)


4.3 Verbal particle with 


Consider now a grammatical structure, the use of adverbial/particle with together 
with verbs of motion, as in these sentences: 


(6) Verbal particle with 
We're going now. Are you coming with? 
Are you taking your phone with? 


Most English speakers would use along in this context, or just the verb without an 
adverb at all. This construction is widespread not only in the Upper Midwest but 
also in the Lower Midwest, for example in Illinois and Indiana and in historically 
Germanic areas of Pennsylvania (Wolfram & Schilling-Estes 2006).¹⁵ This feature
is also popularly associated with German heritage, presumably based on famil- 
iarity with the parallel construction in German, which has so-called “separable prefix” 
verbs with mit- ‘with’: mitkommen ‘to come along’, mitnehmen ‘to take along’, and
so on. Like stopping, it has multiple immigrant sources, namely across the 
Germanic family: 


(7) Germanic models 
German: Er kommt mit. ‘He comes along/with’ 
Dutch: Hij komt mee.
Danish: Han kommer med. 


This feature has close parallels in English verb particles, such as to come to ‘to regain 
consciousness’ so that this constitutes a lexico-semantic innovation for a set of 
verbs, not a more broadly syntactic one. Most Germanic languages have large, 
productive sets of such “separable prefixes,” yet no American English dialect has 
acquired any beyond with. The feature differs socially from stopping in not being 
stigmatized, perhaps in part (as Raymond Hickey suggests to us) because it is 
less frequent and thus less salient than fortition. 

Perhaps because this feature is not stigmatized, it appears to be increasingly 
widespread throughout the Midwest and beyond, though it remains foreign, to 
our knowledge, in the South. Many speakers in the Upper Midwest use it across 
a wide range of styles and registers, and some are surprised to learn that it is a 
regional feature. In this case, a seed sown in the region’s immigrant past has flour- 
ished and spread as predicted by the koineization process. 


4.4 Other adverbial changes and discourse patterns 


We have explored examples where more than one source language contributed 
to the success of a feature in regional vernaculars. Having a source in multiple 
input languages is not necessary for change via contact. Especially in eastern 
Wisconsin (but stretching westward), we find direct imposition of features from 
German (with further support from Dutch), where “modal particles” like mal are 
used to soften requests and convey other information about speaker attitudes and 
intentions. The English translation of that word, once, is today used in the same 
function: 


(8) Modal-particle-like adverbs 
come (over) here once (German: komm mal her) 
‘just come over here; come over here, won't you?’ 


Similar pragmatically oriented transfers are found throughout the region, including
tag questions, like Wisconsin ainna?, from ‘ain’t it?’, calqued from German
nicht wahr?: “Hot today, ainna?” (from Cassidy & Hall 1985: s.v.). The same 
appears to be true with exclamations, like (American-) Norwegian uff da! ‘oh darn’. 
Still, such single-source features appear to be less common and less central 
grammatically. 


4.5 Emergence of the regional pattern


We have argued that the Upper Midwest’s immigrant past helped guide the region’s 
path to increasing linguistic distinctiveness. That path is indirect, involving koine 
formation set in motion at the end, rather than beginning, of major immigration 
to the region. At the outset of this chapter, we noted that three characteristics of 
language contact are emphasized by the data we presented in this paper. These 
characteristics include the following. 


1 Imposition on particular types of structural domains 

2 The process of leveling and reallocation/hypercorrection from potentially 
multiple sources 

3 The timing of the lag after immigration for leveling stabilization 


The examples laid out in section 4 exemplify these patterns. First, imposition is 
found in examples from phonology and syntax (i.e. from the right side in van 
Coetsem’s scheme in (1)). The Upper Midwest also shows borrowings, but they 
are less instrumental to regional dialect formation (cf. Cassidy & Hall 1985- on 
kaffeeklatch, lutefisk, etc.). Second, the process of koineization is exemplified by 
multiple immigrant languages contributing to the leveling process by sharing 
final fortition, stopping, and the come with pattern. Third is the timing of the 
development of the new variety. The historical patterns of immigration appear 
to match the “lag” associated with koine formation, where new varieties arise 
only generations after immigration ceases. 


5 Summary and Conclusions 


We opened by quoting Hills’ arguments against substrates in American English. 
We close with Meillet’s response to Hills, who had sent Meillet a draft of the paper
(in Hills’ translation):


In order that there may be a linguistic substratum, it is not necessary that there should 
be only one. A complex substratum, especially, brings modifications which it becomes 
impossible to evaluate by reason of their complexity and their variety. The fact that 
French is spoken at the present time in Paris by a majority of provincials and descen- 
dants of provincials and of foreigners and descendants of foreigners is, I believe, of 
great importance. At first sight, the effects are not appreciable, but the fundamental 
result is that “Parisian” is disappearing, drowned in a sort of Koiné (common 
speech), just as Attic formerly disappeared drowned in the Greek Koiné. The 
idiomatic character of “Parisian” is being progressively effaced. I can scarcely 
believe that the great mixing of population that is taking place in the United States 
will not have a similar effect. A banal Koiné is being produced. (Hills 1929: 431) 


These comments are similar to the views of many (e.g. Mencken 1937: 356). 
One thrust of modern variationist sociolinguistics has been demonstrating the 
increasing overall diversity of American English. We raise the question here of
whether the diversification of American English today could be seen as the 
slow-motion resolution of the contacts encoded in our history. The geographical 
differentiation may spring in part from the local working out of koine forma- 
tion and language shift in these communities. Non-English features enter the 
pool of variants available for incorporation into local and regional speech. Over 
post-immigration generations, these variables take on social and regional import 
rather than ethnic/heritage-language meanings. American English has not fully 
crystalized into one coherent whole, but is developing a set of parochial or, in 
contemporary terms, regional koines. 

These complex historical paths do not arise ex nihilo from contact but often involve 
extensions of patterns historically present in English, where acquirers or adults 
seeking to accommodate to new patterns of speech could tweak the inherited 
pattern to produce the new structures. We see this syntactically in both New 
York indefinite topicalization (drawing on topicalization of definites) and Upper 
Midwestern verbal particle constructions of the come with type (drawing on 
verbal particles of the come to type). Neither is really a fundamental syntactic
change such as word order changes, and neither has developed the full range 
found in the immigrant language (topicalization of any constituent in Yiddish, 
much larger sets of verbal particles in German). The negotiation of structural 
change from contact in American English gives every impression of expanding 
possibilities and generalizing on restricted processes. 

Also temporally, these effects show clear signs that they are not mundane 
transfers from other first languages categorically imposed on English. In fact, the 
features were never actually absent from the local varieties, but they may appear 
to be submerged for a generation or two, only to re-emerge later. Similar time 
gaps between contact and the rise of such features once counted as a reason to 
discount “substrate” accounts, but they fit nicely with koineization theory. 

The features treated follow van Coetsem’s stability gradient generally, but 
add some nuance. Borrowing of lexical items is widespread, we see derivational 
morphology borrowed (prefixal über), and we find imposition in phonetics/
phonology (final fortition), for example. But borrowing has created new phonotactic
patterns (like ʃC-) and imposition has been gentler and more indirect
than we might expect, whether in the syntactic extensions or the time lag in final 
fortition. Time and again, we see the interplay between “internal” or structural 
and “external” or social factors in the origins and transmission of change.¹⁶


NOTES 


We owe Raymond Hickey our gratitude for an invitation to write this, and we thank the 
following for a wide variety of comments, discussion and inspiration on this topic and 
this paper: Angela Bagwell, Matt Bauer, Erica Benson, Jennifer Delahanty, Bernd Heine, 
Rob Howell, Neil Jacobs, Brian José, Monica Macaulay, Salikoko Mufwene, Mike Olson, 
Eric Raimy, Kate Remlinger, Becky Roeder, Mary Rose, Luanne von Schneidemesser. The
usual disclaimers, of course, apply. 
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There are notable exceptions. Wolfram and Schilling-Estes (2006) deal with structural 
contact effects on American English, such as Pennsylvanian come with (cf. section 4.3 
below) and derivational morphemes. 

This stands in sharp contrast to the early impact of contact, which led to the forma- 
tion of a variety of pidgins, cf. Dillard (1992). 

Note that the onset of what is arguably the most widespread vowel change in 
American English — the low back merger of vowels in word pairs such as caught 
versus cot — has been argued to be an instance of substrate influence by Slavic 
immigrants on the local variety of western Pennsylvania (Herold 1990, cited in Labov 
1994: 318). 

See Downes (1998: 150-75) for details on rhoticity as a variable in American English. 
Much remains to be done even on loanwords. The key resource for tracking lexical 
results of contact is the Dictionary of American Regional English (Cassidy & Hall 1985-), 
and it becomes especially valuable when used in conjunction with the volume indices 
like DARE (1993) and von Schneidemesser (1999). 

Note that such situations challenge the widespread assumption that time of immigration 
reflects the beginning of bilingualism. 

“Scare quotes” are used as a reminder that each of these terms is a cover for myriad 
very different dialects and sociolects. Recognizing that “Indian English” is the name 
for the English spoken in India, we use “American Indian” or “Indian” in this paper 
alongside “Native American,” but also in line with current norms within Native 
communities and beyond. 

Active preservation and revitalization efforts are underway in virtually every Native 
community, but, as throughout, we focus on L1 English speakers. 

The Lumbee (Wolfram 1996, among others) provide an example of Native Americans 
who are monolingual English speakers, maintaining unique speech patterns while being 
influenced by contact with outsiders (Schilling-Estes 2002). 

Briefly, the Northern Cities Chain Shift is a restructuring of the vowel space where 
the most salient shift, /æ/ raising, is accompanied by /ɪ/ > /ɛ/ > /ʌ/ > /ɔ/ > /ɑ/ > /æ/. 

We define “Upper Midwest” here following the Center for the Study of Upper 
Midwestern Cultures, to wit as: “Although the exact contours of the Upper Midwest 
are open to debate, most arbiters apply the term to Minnesota, Wisconsin, and the 
Upper Peninsula of Michigan (with overlap into lower Michigan, Ontario, Manitoba, 
the Dakotas, Iowa, Illinois . . .).” 

See also McCarthy (2007). 

We follow Iverson and Salmons (1995; 2007) and many others, taking laryngeal dis- 
tinctions in English and German to be better captured by considering aspiration 
(“spread glottis,” “fortis,” etc.) to be the distinctive feature rather than “voice,” as found 
in Dutch, Yiddish, Polish, and so on. 

This reflects a nonstandard use of hair as a count rather than a mass noun, at least 
in part attributable to German influence, and represents part of a broader pattern, cf. 
a scissors and various similar forms. 

The syntactic analysis of this construction — whether the with is better understood as 
a “verb particle” construction or adverb — is irrelevant for present purposes. 

It is also reported as a feature of South African English, presumably by a similar 
historical path from Afrikaans. 
17 In the extreme case, Quinault English, it appears that we have language shift without 
lingering L1 transfer into the new variety, but at the same time, the emergence of a 
new linguistic trait (glottalization) that draws on a familiar pattern in English and shows 
affinity to certain groups, namely other Native communities. But this outcome 
appears to be the exception, not the rule. 


23 Contact Englishes and 
Creoles in the Caribbean 


EDGAR W. SCHNEIDER 


1 Introduction 


The Caribbean, defined here as the chain of Caribbean islands and the adjacent 
mainland of Central (cf. Holm 1983) and northern South America, has been a region 
marked by linguistic and cultural contact to an exceptional degree. Sadly 
enough, the original indigenous population, native Americans of the Arawak and 
Carib tribes who had themselves moved to the region from South America, 
played practically no role in the early contact situations. They were small in 
numbers anyhow, and in practically all cases did not survive the challenge posed 
by European immigrants (i.e. both the Europeans’ military superiority and the 
effect of germs against which the indigenous peoples had no resistance) for long. 
Thus, contact in the Caribbean in its early phases was unusual because to all intents 
and purposes it started from a tabula rasa-like situation. It mainly involved two 
broadly defined, ethnically distinct population groups between whom the distri- 
bution of power was exceptionally unequal: Europeans and Africans. 

European powers active in the Caribbean were basically the main seafaring 
peoples of the colonial period: originally the “discoverers” of the “New World,” the 
Spaniards, soon followed by the Dutch, the French, and, considerably later, the 
British. The region was attractive for its agricultural potential, especially from 
the mid seventeenth century onward when the sugar industry promised immense 
profits. The Europeans established plantation economies on the Caribbean 
islands, settled by a small stratum of adventurous society leaders. These were 
attracted by the desire for power and profits, as was a larger number of laborers 
who either sought a better fortune which would have been out of reach at home 
or who were shipped off involuntarily as prisoners, indentured laborers, 
debtors, rebels, prostitutes, and so on (Holm 1994: 338). Politically speaking, the 
Caribbean has had an amazingly checkered history, with many islands changing 
hands repeatedly, either by force or through agreements made in any of the peace 
treaties of far-away Europe throughout these centuries, with several colonial pow- 
ers competing for the wealth that the region offered. This competition took many 
forms — from settlement initiatives and political claims of sorts via open warfare 
to diplomacy, and, last but not least, the illegal but common and effective tactics 
of buccaneering. 

Conversely, Africans, who soon came in much larger numbers, had no oppor- 
tunities or choices. Millions of Africans from various West African coastal regions 
were enslaved and forcibly transported to the Caribbean to work the plantations 
and to satisfy the ever increasing demand for cheap and exploitable labor, espe- 
cially in the sugar industry. This most deplorable chapter of human history gener- 
ated a wide range of complex contact situations, in many different localities 
and involving an array of different peoples. Like the Europeans, the West African 
ancestors of much of today’s Caribbean population represented a range of peo- 
ples, cultural traditions, and languages. All situations, however, were marked by 
extremely unequal power distributions between white planters and their support 
staff on the one hand and black slaves on the other. Numerically, the balance shifted 
in the course of time, with the Africans growing in proportion and constituting a 
strong majority at most localities after a certain point in time. 

The result of this situation is well known, and fairly well documented: creolization. 
Throughout the Caribbean we find creoles spoken by the majority of the popu- 
lations, mixed languages with input from both the European side (usually pre- 
dominantly in the vocabularies; cf. Allsopp 1996; Cassidy & Le Page 1980) and 
from African sources (arguably most effective on the grammatical level, though, 
as we will see, the exact amount of this impact has been under dispute). Reflect- 
ing the positions of their speakers in the social hierarchies involved, these two 
components have conventionally been labeled “superstrate” and “substrate” 
input, respectively. While the fact that contact and mixing occurred and have 
fundamentally determined the language situation of the Caribbean is uncontro- 
versial, in detail conditions varied from one locality and polity to another, and 
so we find a wide range of contact results, from “deep” or “radical” via inter- 
mediate or mesolectal to “light” creoles. These varieties in turn can be accounted 
for as products of either postcolonial approximations to the standard varieties 
in their vicinities, or as different degrees of restructuring in the first place. How 
the interrelationship between these has to be understood is not always clear 
and is sometimes controversial, and it definitely may be frustratingly complex. 
For example, in the two-island state of Trinidad and Tobago, the most creole and 
“deviant” speech forms of Trinidad correspond roughly to what is only mesolectal 
on Tobago, where due to historical reasons an even deeper form of creole is widely 
available. 

The notion of “Caribbean English Creoles” (CEC) is a useful and widely estab- 
lished cover term for these varieties, though it should be conceptualized (and under- 
stood in the present context) to encompass the entire range of English-related 
varieties found in the region. 


2 Historical Background 


In the sixteenth century Spain practically held a monopoly on trade and was the 
predominant regional power, but the French, the Dutch, and then also England 
began to challenge this position. The motives for expanding into the region were 
in fact different: While Spain was driven by a curious mixture of desires for exploita- 
tion but also conversion of native populations, the primary interest of the Dutch 
and also others was trade. England’s activities soon turned to settlement, how- 
ever, in the Caribbean as in North America: The New World promised land and 
work, while England in the seventeenth century suffered from a shortage of the 
former and a surplus of men (Parry & Sherlock 1981: 52). St. Kitts was the first 
British settlement in the Caribbean (1624), and an important early point of dis- 
persal (Baker 1998); but others followed soon, and in rapid succession: Barbados 
(where Robert Powell, on his way back from Brazil, took formal possession in the 
name of the King in 1624 and then persuaded London merchants to launch a 
settlement initiative), St. Croix (1625, shared with the Dutch), Tobago (1625 saw 
an early but at that point unsuccessful settlement attempt, the first in a long series 
of varying ownership claims), Nevis (settled in 1628, and providing the earliest 
example of the common Caribbean pattern of internal migration, with settlers 
having come from St. Kitts), Antigua and Montserrat in 1632, and so on. In the 
early decades smallholders and indentured servants produced primarily maize 
and tobacco for the European market, under difficult conditions, struggling 
for patronage and to secure land titles in London, more often than not fighting 
competing investors. African slaves, the first ones imported by the Spanish a 
century earlier, were around, and their numbers grew, but they were still a 
definite minority, and so they had more exposure to and better opportunities 
for gradually adjusting to the speech forms used by whites. 

However, around 1650 things changed drastically, revolutionizing the 
economic basis and the racial composition of the islands. The production of sugar, 
introduced about a decade earlier, brought immense profits — but unlike tobacco 
growing it requires factories to produce it, thus huge capital and large planta- 
tions in order to be successful. This meant a great demand for and large-scale 
importation of slave labor, and thus within a relatively short period of time a 
complete reversal of the black-white demographic proportions. Two locations 
epitomize this new state of affairs more than others. In Barbados large numbers 
of small proprietors and tobacco farmers gave up and sold their lands; huge estates 
grew, and the proportion of blacks to whites changed from ca. 6,000 : 40,000 in 
1645 to a strong black majority of 46,000 : 20,000 in 1685 (Parry & Sherlock 1981: 
69). Jamaica, conquered by Cromwell’s troops in 1655 and soon to be Britain’s 
success story in the region, did not evolve as a settlement colony but rather 
became the prototypical sugar island with a predominantly African-derived 
population, at a black : white ratio of about 15 : 1 (Cassidy 1961: 16; Lalla & D’Costa 
1990: 23). 

Relatively little is known about the exact details of social contacts, and espe- 
cially interethnic relations, during that period, as the interest of contemporary reports 
tends to focus on commercial aspects and the white perspective. Slaves came from 
the entire West African coast, from today’s Sierra Leone via the Ivory, Slave, 
and Gold Coasts to the Niger Delta, an area with a complex culture, considerable 
agricultural expertise, and also a local slavery tradition. The conditions of life for 
slaves on the sugar plantations were cruel, marked by a rigorous discipline and 
brutal penalties. The proportion of Europeans was relatively low, and relations 
between the races were mostly distant to hostile — fear of slave uprisings was a 
constant reality on the islands. In the eighteenth century there was still a small 
number of poor whites around who did not fare much better and interbred with 
slaves (Parry & Sherlock 1981: 70). Later legislation improved their situation to 
avoid an even greater ethnic imbalance (felt to be a threat to safety). The European 
languages, the languages of those in power and as such the established lingua 
francas, were the common targets of the slaves’ language acquisition, but access to 
them was highly restricted, so on many islands the conditions for creolization were 
met. Throughout the eighteenth century slave importation went on continuously 
and in large numbers, because “slaves in those days were short-lived” due to ill 
treatment and epidemics, and “replacement was more economical than rearing 
slave children” (Parry & Sherlock 1981: 70). As far as we can tell, political turmoil 
caused by American independence or volatile sugar prices did not much affect 
life on the plantations (and thus the language contact settings) until Emancipation. 
Petty terror by overseers was probably more of a daily reality than constant 
terror, and some slaves worked on small plots of their own (and some even man- 
aged to become free), so even in terms of social relations and realities there was 
a cline of options and different practices, but one clearly and decidedly leaning 
toward the lower end from the slaves’ perspective. 

Emancipation, long prepared by political debates in England and preceded by 
the official end of the slave trade in 1808, came in 1834, though another four years 
of “apprenticeship” followed, a period during which slaves had only limited rights, 
the assumption being that they would need to be prepared for freedom. For slaves 
freedom brought some new options and removed the constant threat to life and 
to physical integrity, but actually for many of them things did not change 
much in practice (Alleyne 1980: 186). After all, lack of skills or property did not 
leave any real choices to most of the black people, and working the sugar cane 
was not any easier afterwards. The old plantation system was dissolved only 
gradually and in parts, but in many places it resurfaced in new but similar forms. 
One thing that did emerge during this period, however, was the social pattern of 
small-scale peasantry, especially in Jamaica, with former slaves who managed to 
secure a plot of land for themselves becoming smallholders and leading a rural 
life, relatively isolated from developments in the outside world (Patrick 2008: 127). 
To some extent this is a form of life that persists to the present day. 

However, in some countries the end of slavery did have severe consequences 
for the society’s demographic make-up and hence the conditions of linguistic 
contact. The continuing demand for cheap labor led to the immigration of large 
numbers of indentured workers from India who then stayed on in their host 
countries and added a new factor to the contact equation. This applies to 
Trinidad and Guyana in particular — in these two countries roughly half of 
today’s population is of Indian ancestry. While these Indians set up their own 
communities and have upheld many of their social and religious traditions, by 
now many of them have given up their ancestral languages and shifted to local 
forms of creole English. In Guyana, Indo-Guyanese are predominantly rural and 
speak a basilect, while Afro-Guyanese are more urban and tend to speak a 
mesolectal variety; there have been repeated upsurges of ethnic tensions between 
both groups. 

The present-day “anglophone” (i.e. mostly CEC-speaking) Caribbean consists 
predominantly of island and mainland states with a British colonial history — large 
ones like Jamaica, Barbados, Trinidad and Tobago, Guyana, the Bahamas, and Belize 
(formerly British Honduras), and smaller ones like Antigua, Barbuda, Grenada, 
St. Vincent, St. Lucia, St. Kitts, Nevis, the US Virgin Islands, the Turks and Caicos 
Islands, or the Caymans. In addition, there are a few English-speaking pockets 
stranded for some reason in regions which are predominantly Spanish-speaking: 
in the Bay Islands of Honduras (settled mainly from the Caymans), in Panama 
(where the construction of the Canal had attracted workers mainly from Jamaica), 
in Costa Rica (spoken by workers on banana plantations), or along Nicaragua’s 
Miskito Coast (Holm 1983; 1986). Again, contact conditions vary from one place 
to another. Ethnic composition has been a decisive factor in this: some degree 
of interbreeding has always been a part of Caribbean life, and traditionally a loose 
correlation between skin complexion and status has been in effect in the region’s 
social stratification. 


3 Contact Parameters 


Language contacts in the Caribbean in the present context resulted primarily from 
encounters between Europeans (British people in the present case) and Africans, 
and secondarily also between Africans with different cultural and linguistic origins. 
We have very little documentation of the exact nature of such interactions in 
the early phase (which obviously is strongly relevant for shaping the ultimate 
linguistic outcome, based on the “founder principle,” Mufwene 1996), but we can 
make educated guesses, as it were, based on what we know about demographic 
data and the social structure of these communities in earlier centuries. 

As is well known, the basic precondition for creolization to occur is a large dis- 
proportion between a small number of target language speakers (i.e. Europeans) 
and a large majority of learners (i.e. African slaves) adopting the language. 
Certainly this was the case on many plantations, where the social setup was 
characterized by a preponderance of slave labor, frequently absentee ownership, 
and the presence of relatively few whites with managerial functions, mainly as 
overseers. However, all of these factors primarily apply to sugar plantations so 
they fail to adequately grasp the phase prior to the sugar period and in locations 
with other economic bases. Therefore, we have to distinguish between an early 
“homestead phase” (Mufwene 2001: 34, following Chaudenson 2001), with whites 
usually still constituting a majority of the population, and the later plantation phase. 
As was pointed out above, in Barbados the transition between these stages 
occurred around 1650; in Jamaica there never really was a full homestead phase, 
with the sugar economy reigning pretty much from the beginning of British 
colonization. In smaller islands the situation may have been more variable. 
Emphasizing differences between these phases has important consequences for 
the creolization scenario: The traditional creolist position of basilects having been 
the original forms of Caribbean creoles and approximations to English resulting 
from decreolization coming in only later builds upon the conception of large 
plantations, while a more recent understanding of variability having been there 
from the beginning assumes that intermediate and less creolized forms grew 
during the early homestead phase and have been maintained, or at least highly 
effective and influential, because of the founder effect. 

As far as the nature of the social relationship between the races is concerned, 
we are again faced with a stereotypical image which probably covers the most 
important aspects of reality but which at the same time is in need of some 
reflection and qualification. The basic pattern was most likely one characterized 
by distance, resentment and even hostility between the races — reflecting the slaves’ 
legal status as property without rights and indicated by the constant fear of whites 
of slave uprisings. Clearly, this is a situation which maximally impedes commu- 
nication, i.e. it results in very limited and rather superficial language contact and 
is thus consonant with the creolization scenario. There were exceptions, however, 
although in numerical terms these probably remained relatively insignificant. Not 
all slaves were field hands — some were house servants (numerically speaking, 
in Jamaica one out of eight slaves was a domestic helper, according to Roberts 
1988: 8), others were trained as craftsmen, artisans, etc. (Alleyne 1980: 184); and 
these roles typically would have entailed a wider range of encounters with whites 
and thus a higher degree of linguistic accommodation. While these hierarchical 
layers tended to be rigidly established and enforced, a monolithic conception of 
the social structure and realities of the Caribbean is certainly inappropriate. 

It is probably noteworthy that Emancipation does not seem to have changed 
social relationships and communicative patterns as much as one might have 
expected. By that time linguistic conventions had been more or less stabilized in 
most regions anyhow. Demographic proportions between blacks and whites were 
not greatly affected, and it may be assumed that roughly the same applies to 
linguistic habits in interethnic encounters. Thus the assumption that the end of 
slavery would have resulted in greater exposure to standard English seems 
misguided. Schooling certainly did have an effect along these lines, but it spread 
only slowly and not very effectively (and practically it only became more 
effective on a broader scale during the later twentieth century; Christie 2003: 13). 
The newly arrived Indian laborers, mostly in Trinidad and Guyana, apparently 
acquired the local vernaculars from blacks whom they worked with on the plan- 
tations, so they tended to adopt the creoles. 


4 Creolization and Decreolization 


The Caribbean is one of the world’s main regions where creole languages are 
spoken, so any account of language contact there overlaps to a considerable extent 
with what we know or think about the emergence and evolution of creoles in 
general. The dominant theories and concepts of creole studies have typically been 
developed on the basis of Caribbean languages or have been applied to them. 
This has been a history of controversies, so we do not stand on firm ground here. 
However, on zooming in more closely it can be claimed that since the beginnings 
of creole studies as a recognized academic discipline, roughly in the 1960s, theory 
formation and creolists’ debates have revolved around the same issues and ques- 
tions, in various guises: How closely related are the creoles to their European lexifier 
languages? How much of African origin can be identified in the Caribbean creoles, 
especially in their grammars? And how can the structural similarities between 
historically unrelated creoles (with different lexical bases) and their European source 
languages be explained? To this last question essentially two types of response 
have been proposed: African substrate effects, or universal principles of human 
language in general. 

Universalist theories have tended to suggest that shared features of creoles go 
back to the effect of deeply rooted principles of language organization or even of 
human cognition in general, like tendencies toward simplicity or the avoidance 
of markedness which became activated in situations of natural second language 
acquisition. The most extreme suggestion along this line of thinking was Bickerton’s 
“bioprogram” hypothesis (1981), which proposed that preferences for certain struc- 
tures are genetically encoded because, phylogenetically, they offered a competitive 
advantage increasing chances of survival to pre-humans, and that it is only in 
plantation situations, when children are faced with insufficient linguistic input 
(which normally overrides the genetic code), that their genetic endowment 
shines through, as it were, and becomes fully effective. It seems safe to state that 
Bickerton’s theory, while having met with great interest and having triggered import- 
ant research, has simply not been confirmed by subsequent findings and is rejected 
by almost all creolists today. However, the related but much wider question of 
whether creoles can be identified as such on purely structural grounds is still around 
and has found another, equally controversial revitalization in McWhorter’s (1998; 
2000) suggestion that the combined absence of three features (inflectional affixation, 
tone, and derivational noncompositionality) marks creoles as relatively young languages 
(with these properties only emerging through time in aging languages). In any 
case, it seems clear that the scenario of abrupt creolization by children, on which 
Bickerton’s theory builds so strongly, fails to adequately describe the realities 
on Caribbean plantations: As was stated above, the main strategy to secure the 
presence of slave labor, e.g. in Jamaica, was continuous importation from Africa 
rather than raising slave children on location. The number of children on a sugar 
plantation tended to be very small, certainly insufficient for them to constitute a 
community of creators of a new language (Roberts 1988: 111). 

Obviously, the substrate thesis builds strongly upon the idea of contact 
between African languages and English. Whether or not contact with long-term 
linguistic consequences had already taken place in Africa, i.e. whether a pidgin 
precursor to which Caribbean creoles can be traced back had already arisen on 
African soil, is equally disputed. The strongest proponent of this thesis was Ian 
Hancock, who suggested that a “Guinea Coast Creole English” might have been 
the ancestor of these languages (e.g. Hancock 1993). More recently, McWhorter 
(1997) ventured to locate the major site of early contact and the genesis of a proto- 
pidgin specifically at the trading fort of Cormantin on the Gold Coast. The issues 
involved here are relatively speculative, as the precise historical evidence we have 
is even scantier than that concerning communication patterns on Caribbean plan- 
tations. In Africa, captured slaves were held and “seasoned” for a while, presumably 
up to a few weeks, in coastal forts until the next ship for the Middle Passage stopped 
by; and we know about the dreadful conditions and high mortality rates during 
the Atlantic crossing. None of this sounds conducive to even the most 
rudimentary language acquisition, so I am skeptical with respect to the idea of a 
pidgin having been imported to the Caribbean. But the question of where CECs 
ultimately originated seems actually not that important for weighing the African 
input in general, given that without doubt the African slaves brought their 
knowledge of their native languages with them into the contact setting. Creolization 
definitely can have happened on the Caribbean islands without any African-based 
prior pidginization, and actually this seems much more likely than it having taken 
place in Africa before transportation to the Caribbean. 

On the other hand, the fact that substrate influences did occur and probably 
played an important role seems all too obvious and difficult to dispute — even if 
some scholars have radically denied it and the exact amount and importance 
of African traces is difficult to identify. Early and influential proponents of 
the substrate view include John Holm and, most notably in his 1980 book 
which adopted a historical-comparative perspective and pointed out similarities 
between Caribbean creoles and African languages, Mervyn Alleyne (also, with 
respect to cultural traces, Alleyne 1988). Boretzky (1983) is a major scholarly study 
adopting the same strategy: formal and structural parallels between African 
and Caribbean language patterns on the levels of phonology and grammar are 
worked out systematically and in great detail, thus producing a strong case (by 
means of undogmatic assessments and carefully weighing differing interpretations) 
for substrate transfer effects especially of Kwa and Mande languages in the emer- 
gence of Caribbean creoles. Among the features for which he finds strong and 
suggestive parallels are the following: “vowel elision in Sranan, similar to what 
is found in Yoruba and Ewe” (Boretzky 1983: 57-8); “the conflation of progres- 
sive and habitual functions in preverbal markers, e.g. in Sranan de/e and in Yoruba 
n-” (Boretzky 1983: 130-8); “a lexical verb meaning ‘finished’ being grammat- 
icalized as a completive marker, especially in postverbal position, e.g. Sranan 
kaba and Yoruba ti/t6/ta” (Boretzky 1983: 137-8); etc. More recently, Migge (2003) 
and Huttar (2008: 217) convincingly argued for substrate effects on Suriname’s 
creoles (Ndyuka in the second case). Even more detailed cases for substrate 
influences have been made outside the domain of English-based creoles (though 
they may be taken to be indicative of the potential of such effects): Berbice Dutch, 
a creole of Guyana, has been shown to be very directly related to and motivated 
by Eastern Ijo. Claire Lefebvre (and others) have for a long time attempted to trace 
Ewe-Fon reflexes in Haitian Creole. The most recent assessments of the substrate 
theory are available in Migge (2007) and in chapter 8 of Mufwene (2008); see also 
Singler (1988) and Mufwene (1993). 

The case of Berbice Dutch is exceptional and convincing, not least 
because of the homogeneity of the African input in that region. It is known that 
all the slaves there brought the same African language to the contact situation, 
something which of course provides for a compact and tightly circumscribed sub- 
strate, with relatively few potentially interfering factors. However, in most other 
situations this does not apply as the majority of slaves on any plantation tended 
to come from widely different regional and linguistic backgrounds. In fact, a widely 
cited assumption has been that plantation owners deliberately attempted to mix 
slaves from different tribal origins and linguistic backgrounds in order to con- 
sciously deprive them of the option of communicating with anyone else on the 
plantation in their own African languages (to make secret communication and 
thus insurrections less likely). There is probably some truth in that (and to this 
extent such a strategy would have had linguistic consequences, weakening direct 
transfer from any single African language), but it also seems questionable whether 
or to what extent such a policy was, or could have been, imposed systematically. 
On the one hand, it is not clear to what extent planters could or would really 
have cared about such a policy — when they were in need of field hands they prob- 
ably would have filled this need at the next available auction. Furthermore, the 
opposite consideration may have played a role as well: Slaves from certain 
regions were specifically sought for special qualities and skills. For instance we 
know, though the evidence is scant, that some African languages did not die out 
in the Caribbean: at occasional festivities slaves would have practiced some 
African customs and used their ancestral languages (though there is no way of 
telling if these were not rudimentary, formulaic or frozen patterns in the long run), 
in particular on Jamaica (see Cassidy 1961: 18; Lalla & D’Costa 1990: 2; Patrick 
2004: 407). There, the maroons have been credited with having retained much more 
of their African roots, including traits of language (though, again, it is not really 
clear what specific African languages were used for what purposes amongst the 
maroons; cf. Alleyne 1988: 125-6). 

Clearly the contact situation must have been such that field hands on 
Caribbean plantations heard some but relatively little English around them 
(mainly from overseers, occasionally from other whites). For new arrivals, the model 
variety in their language acquisition process was probably not first-language English 
as spoken by whites but rather a contact variety itself, as produced by other Africans 
who had been around for a longer period of time (Cassidy 1961: 19). In that pro- 
cess of adjusting to their environment and adopting a communicatively useful 
language form they picked up words and probably also patterns, generalized these 
through some internal process of linguistic hypothesis-making, and compensated 
for whatever structural or other linguistic input was not available to them by 
filling in patterns, using sounds or trying to use words from their native language 
knowledge — thus contributing to the available “feature pool” in a given com- 
municative setting and enriching the African component and increasing its 
linguistic usefulness. And some of these forms and features would have been 
communicatively successful (either coincidentally or because other Africans 
around them shared similar underlying linguistic concepts from their own native 
languages) and were thus strengthened, maintained, and integrated into the local 
linguistic repertoire of the speech community. It makes sense to assume that the 
relative strength of the respective input factors (English, African, or whatever), 
determined by demographic proportions, interaction habits, and other components 
of the communicative setting, was ultimately decisive in shaping or at least 
influencing the local linguistic repertoire. 

Furthermore, it seems clear that a diachronic component is also important to 
consider here, in line with, though perhaps not exclusively restricted to, Mufwene’s 
idea of a “founder effect.” During the earliest stages of the formation of a new 
local speech community (i.e. one big plantation — on which most field hands would 
have had few if any outside contacts for most of the time) no established norms 
existed, so the impact of speakers and their features in these situations would have 
been stronger, less constrained. In the course of time, however, on the basis of 
the success or failure of certain linguistic forms, corroborative effects and thus 
habit formation would have generated a new, local linguistic norm, which in turn 
stabilized and was strengthened by more and more speakers tuning in to it. Those 
who came later already found such a norm, and for them it was more a matter 
of acquiring it or at least adjusting to it in the course of time (though it is clear 
that these people still contributed to the universe of linguistic forms surround- 
ing them and to the development and the shaping of linguistic habits in the speech 
community at large). 

It is also noteworthy that this process of creole formation and creole usage 
operated not only monodirectionally and did not involve African language 
acquirers only (though this is not to deny the huge disproportions here). We 
have reports stating that whites picked up creole as well and used it in appro- 
priate communicative situations (Holm 1994: 334-5). Most evidently this would 
have characterized the speech of poor whites and indentured laborers who 
also worked the fields, especially during the early, “homestead” phase of the 
seventeenth-century colonies, but not exclusively so. From the mid eighteenth 
century we have reports of both “ladies,” who were not sent to England for 
education, and of children (“for a Boy, till the Age of Seven or Eight, diverts 
himself with the Negroes, acquires their broken way of talking,” reported from 
Jamaica in 1740) adopting the “gibberish” spoken by slaves (Mühleisen 2002: 63; 
see also Cassidy 1961: 21-3). The outcome of this can best be observed today 
on Barbados, where Blake (2002; 2004) studied black-white speech relationships 
and usage and found that poor whites use language forms which are definitely 
influenced by and similar to black people’s creole (or dialect). 

Such considerations, and a range of studies engendered by similar ideas, have 
tended to weaken “extreme” positions in recent years and to suggest more 
moderate compromise hypotheses which tend to deny the “exceptionalism” of 
creoles (DeGraff 2003) or even the status of creoles as separate languages, imply- 
ing that they are best considered varieties of, or at least closely related to their 
lexifiers (Mufwene 2000; 2001), in contradistinction to an earlier, widely discussed 
theoretical stance by Thomason and Kaufman (1988). In general, the recent trend 
has been to emphasize the contact-induced character of creoles and thus to put 
them in line with other varieties and languages which are known to have been 
strongly shaped by language contact (Thomason 2001; Winford 2003), but to some- 
how sidestep the issue of their purported status as new languages unrelated to 
their lexifiers. Winford (2000: 217) suggests that intermediate creoles should best 
be regarded as products of untutored second language acquisition and language 
shift, and as restructured forms of English. 

None of the major controversies in the field has really been solved, but the insight 
seems to be spreading that extreme positions fail to capture the whole range of 
observable phenomena. As to the origins question, the conference volume edited 
by Muysken and Smith (1986) has deepened our understanding of the issues and, 
to my mind, has made it clear that, in the words of the title of Mufwene’s con- 
tribution, “the universalist and substrate hypotheses complement one another” 
(1986): It would be difficult to deny that substrate forms and patterns are effec- 
tive, just like superstrate input has been, while “universal” and cognitive factors, 
together with social factors, apparently contribute to the selection of which of these 
forms will ultimately be successful in the formative process. Work by Arends (1993) 
and others has shown that creole evolution proceeds gradually and involves some 
restructuring by adult speakers rather than in an abrupt fashion and exclusively 
by children, an observation which largely invalidates the genetic endowment 
predicted by the “bioprogram” hypothesis. Against the earlier, implied assump- 
tion that only deep basilects are “real” creoles it has been shown and argued 
that creoles come in various “degrees of restructuring” — which can nicely be 
observed when comparing different regions in the Caribbean (Schneider 1990; 
Neumann-Holzschuh & Schneider 2000). 

Many of these observations can be convincingly accounted for in the more recent 
“uniformitarian” framework suggested by Mufwene (2001; 2008), which assumes 
that language evolution always operates under contact conditions, that speakers 
from different linguistic backgrounds bring their respective features to a contact 
situation, and from this “feature pool” some forms ultimately tend to be successful 
in an ongoing process of “imperfect replication” of forms in the speech commu- 
nity. These features are selected and reproduced while others die out. Mufwene’s 
framework claims to be valid for language evolution in general, even if it has been 
developed primarily by working on creole languages (including Caribbean ones). 
Essentially, the claim resulting from this perspective is that creoles and non- 
creoles are distinguished not in principle but only by different proportions of 
substratal versus superstratal features, respectively. And what we find through- 
out the Caribbean is very clear: varieties with variable degrees of distance from 
their European donor languages (and, conversely, from their African substrates 
as well). 

This issue of the “depth” of CECs has not only a synchronic dimension but also 
an important diachronic side to it, namely the conventional idea of present-day 
mesolectal varieties being products of “decreolization.” The traditional assump- 
tion in creole studies, still to be found in many handbooks, used to be that the 
extreme contact situations on newly formed plantations quite abruptly produced 
maximally deviant basilects, which in the course of time, through increasing 
exposure to the standard varieties exerting pressure from above, as it were, were 
“decreolized,” i.e. they adopted standard language features piece by piece, gradu- 
ally replacing the erstwhile creole features. This results in “post-creole continua,” 
with typical examples found in Jamaica or Guyana, where a range of variants 
stratified from basilectal via mesolectal to acrolectal varieties can be observed. 
It is important to understand that in this line of thinking all non-creole forms 
are thought to be later intrusions into an erstwhile “pure” creole system, which 
in turn was assumed to have originated at the outset. Intermediate forms of 
Caribbean creoles which combine typically creole with typically acrolectal features, 
found very widely, have thus been interpreted as younger, mixed patterns, indi- 
cating the ultimate demise of the basilectal creoles. 

A characteristic example of this line of thinking is the debate on the nature of 
early Bajan on Barbados. Barbados is noteworthy in that its creole is obviously 
closer to English than that of Jamaica, for instance. Cassidy (1980) claimed that 
this situation is a product of decreolization, with early Bajan a full-fledged creole. 
In contrast, Hancock (1980) believed that a deep creole had never been spoken 
on Barbados, due to the island’s special topographic and demographic conditions. 
The matter was considered an open question until Rickford and Handler (1994) 
and Fields (1995) produced evidence showing that in earlier texts from Barbados 
more creole features can be found than today, thus suggesting a sequence of an 
original creole and a later, more decreolized variety. In contrast, Winford’s (2000) 
historical survey suggests a much more refined and differentiated social and lin- 
guistic scenario that has produced today’s situation without resorting to extreme 
positions. Note that the evidence produced by Rickford, Handler, and Fields, while 
documenting that more basilectal forms than today were in use, does not neces- 
sarily imply that this is the whole story, i.e. it does not rule out the possibility 
of variability at that earlier stage as well (as postulated by Blake 2004: 502, and 
conceded by Rickford & Handler 1994: 228; cf. Winford 2000: 223). 

The decreolization scenario, as suggested by DeCamp (1971) and Bickerton (1975) 
and modeled in their “implicational scales” framework, has been questioned increas- 
ingly, however. Rickford (1983) doubted whether it adequately accounts for the 
situation. Similarly, Le Page and Tabouret-Keller (1985) rejected the model as too 
simplistic, arguing essentially that the variation to be observed is multidimensional, 
caused by a complex range of interacting factors, rather than corresponding to 
the monodirectional post-creole continuum. Patrick (1999; 2004: 410) considers the 
theoretical status of the Jamaican mesolect and argues that rather than viewing 
it as a mixture between basilectal and acrolectal features, it has to be understood 
as a linguistic system in its own right. Mufwene’s framework, mentioned above, 
also denies the validity of what he calls “debasilectalization” and implies that in 
all varieties there has been graded variability from the outset. Historically speaking, 
the evidence available clearly shows that mesolectal forms predated the basilectal 
forms (e.g. past ben, pl. -dem, possessive fi, serial verbs, or the pronoun unu, all 
of which appear relatively late in the record; Patrick 2008: 149-50). 
More generally, this ties in with the synchronic question of how the intermediate, 
less strongly creolized varieties, which can be found in many localities, 
but do not seem to be derived from a basilectal forerunner, can be explained. 
Schneider (1990) showed that with respect to the English-related varieties of the 
Caribbean there exists a “cline of creoleness” between “deep creole” varieties like 
Jamaica and Guyana (and especially the creoles of Suriname, which, however, were 
not the subject of that paper), via “intermediate” ones like Bajan to “light” vari- 
eties which display only a small fraction of the features typically considered 
characteristic of creoles, as spoken, for instance, on the Cayman Islands or the 
Bay Islands of Honduras. The line of thinking was continued in works such as 
Neumann-Holzschuh and Schneider (2000), arguing for the existence of “degrees 
of restructuring” more generally, or Holm (2004), who assumes the existence of 
“partially restructured” varieties. 


5 Products of Contact 


5.1 Characteristic features 


Especially on the level of morphosyntax the CECs (and related “lighter” varieties) 
have a number of properties in common that set them off as a special group of 
language varieties, even if many of them are shared with English-related creoles 
elsewhere. Most examples in this section are taken from the contributions to 
Schneider et al. (2004) and Kortmann et al. (2004). 

The most distinctive of all “creole features” are the so-called preverbal markers, 
short particles placed before the verbs expressing aspectual distinctions rather than 
tense relationships (as English does, mostly using suffixes). The precise semantic 
specifications of the meanings of these categories may vary from one variety to 
another, but basically the forms widely found (with some examples) include 
anterior (JamC¹ ben/men/wen, mesolectal did), progressive (JamC a/da/de; e.g. 
St. Vincent he mama a call im ‘his mother is calling him’), perfective/completive 
(JamC don, Sranan kaba), habitual (Gullah doz, Eastern Caribbean da/de, mesolectal 
doz/is; JamC is assumed to lack a distinctive marker here; e.g. Antigua e de see e 
breda ‘she sees her brother regularly’), and future/irrealis (JamC go/wi, GuyC go, 
sa; Rickford 1987). Combinations of these forms are also possible (e.g. JamC Mi 
ben a ron ‘I was running’, Barbuda she a go sing ‘she will be singing’, Trinidadian 
bina for past imperfective), and mesolectal varieties have forms recognizably closer 
to English (like TriC yuuztu for the past habitual). 

Characteristically, Caribbean creoles have distinct copula forms depending on 
their functions. Actually, no copula tends to occur before adjectives (as in JamC 
Him lucky, Trinidad De baby sick), so that, commonly, these are also analyzed as 
being equivalent to static verbs (and they also take preverbal markers). However, 
before nouns (in “equative” structures) a special form of the copula is typically 
found (da/a, e.g. JamC Ebry day da fishing day ‘Every day is a fishing day’). In 
addition, most creoles have a distinct copula form (mostly de or similar) before 
locatives (e.g. im de de ‘he is there’; Barbuda we i de ‘where is he/she?’). Finally, 
Caribbean creoles can introduce sentences with a form of a copula followed by 
a constituent that is highlighted (and later repeated) in the sentence (e.g. Is tiif 
dem tiif it ‘It was certainly stolen’, Holm 1988/89: 179). 

Negation requires no do-support but is achieved by simply placing a negator 
in front of the verb, typically no/na or en (probably reduced from ain’t). 

Object clauses after speech act verbs or mental verbs are commonly introduced 
by the “complementizer” say/se(h) (e.g. JamC Him swear seh him was going to tell 
me ‘He swore that he was going to tell me’; TriC he tell me se he na come again 
‘He told me that he would not come again’) or, in Suriname, taki (from talk). 

In comparison with English, the inventory of pronoun forms tends to be reduced. 
For subject and object functions it is common to use the same pronoun form (e.g. 
mi ‘I, me’), and sometimes this includes the possessive as well. Furthermore, some 
creoles have reduced the gender distinction (though forms like im meaning 
‘she’ are found not that widely and primarily in rather deep basilects). A distinct 
second person plural form unu/wunnah, presumably African-derived, is also 
widespread. 

Creoles mostly lack inflectional morphology — again, most evidently in basilects. 
Pluralization may be achieved by forms like -dem (e.g. JamC di man-dem ‘the men’); 
possessive relations can be expressed by a simple juxtaposing of possessor and 
possessed or by patterns like JamC fi-dem (e.g. fi-me work ‘my job’). Verbs have no 
third person singular suffix, and usually no suffixes for the past tense either (with 
time relations being expressed by anterior markers, context, or lexical means). 

Serial verbs, a sequence of two or more verbs in the same predication, with or 
without an object interspersed, are another unique kind of construction, e.g. JamC 
Im tek naif kot me ‘He cut me with a knife’; Sranan mi seri a oso gi en ‘I sold the 
house to her.’ 

Finally, a totally different domain of linguistic expression, rarely ever explored, 
deserves to be mentioned here, namely creole pragmatics: in things like greeting 
practices, expressions of politeness, the social meanings of seemingly aggressive 
statements, and some aspects of gestural and nonlinguistic expression (like 
“suck-teeth,” a facial expression of strong disapproval) CECs have quite distinct 
properties which in some respects may ultimately be African in origin. Very little 
work in this interesting area of language contact effects has been done so far, with 
a volume edited by Mühleisen and Migge (2005) being the main exception and a 
solid beginning. 


5.2 The cline of creoleness 


Pending the still open question of what it actually means for a language to be 
“a creole,” i.e. whether creoles can be identified on the basis of their structural 
properties or only because of their shared sociohistories related to plantation 
environments, it seems true that many of the Caribbean English-related creoles 
are somehow “intermediate creoles” in the above sense. Actually, too much of 
the discussion has focused on the big and well-documented varieties of Jamaica, 
Trinidad, or Guyana, and there is still room and an actual need for further fieldwork, 
analysis, and documentation. Initiatives like the ones by Aceto and Williams 
(2003), instigating investigations of “smaller varieties,” Aceto’s own work on such 
varieties (2003; 2004a; 2004b), or studies like those by Walker and Meyerhoff (2006) 
on the small island of Bequia are thus most welcome and much needed. 

The range of different varieties, in terms of their recognizable distance from or 
proximity to dialectal English, the superstrate variety, is amazing. Suriname’s cre- 
oles (cf. Migge 2003; Winford & Migge 2004; Smith & Haabo 2004) are decidedly 
those most distant from English, wholly corresponding to the notion of full and 
deep creoles. They have all the features associated with creoles, some lexical 
influences from African languages and also from Portuguese, and also heavily 
restructured phonotactics (e.g. bradi, bigi, langa ‘broad’, ‘big’, ‘long’) which renders 
many phrases difficult to comprehend for an outsider (e.g. A nyan kaba ‘she has 
eaten’). Jamaica and Guyana also have deep basilects, and both of them are char- 
acterized by the continuum situation discussed above (to which I hesitate to apply 
the traditional term “post-creole”; for an analysis of a literary representation of 
the Jamaican continuum see Schneider & Wagner 2006). In Guyana the positions 
on the continuum are ethnically stratified, with Indo-Guyanese speakers usually 
being more basilectal than Afro-Guyanese. The situation is similar in Trinidad and 
Tobago, where the Tobagonian Creole is more basilectal than the Trinidadian one 
(e.g. TobC i a teacha vs. TriC hi is teacha; James & Youssef 2002: 156). English-based 
creoles can be found on many of the other, smaller islands as well, including 
Antigua, Barbuda, St. Lucia, St. Vincent, Grenada, the Bahamas (Hackert 2004), 
and so on. As was mentioned above, Bajan, while clearly also a creole, is some- 
what “lighter” in comparison, with more restricted uses of preverbal markers, serial 
verbs not commonly in use, pronominal distinctions retained to a considerable 
extent, and also some inflectional endings in use. At the other end of the cline, 
there are some varieties which look essentially like dialects of English but do have 
occasional elements typically associated with creoles in use, like preverbal been 
or done, uninflected nouns or verbs, the complementizer say, etc. Most notably, 
this applies to the variety in use on St. Eustatius, Saba, Montserrat, the Cayman 
Islands, and to the Bay Islands of Honduras (where there is a distinction along 
the cline correlating with skin color; Graham 2005). Work by Wolfram, Reaser, 
Torbert, and others (e.g. Reaser & Torbert 2004) has shown that on the Bahamas 
there is also a gradual difference between white and black speech forms (for example, 
black but not white Bahamians vocalize word-final /-l/ in words like steal). 
Finally, there are also pockets of white dialects to be found in the Caribbean, 
arguably exposed to and influenced by creole structures. For instance, on 
Anguilla Jeff Williams studied a dialect spoken by people of European descent 
which has habitual do (it do be hot) or zero possessives (my daddy brother; Aceto & 
Williams 2003). 

This “cline of creoleness” illustrated above by listing different varieties can also 
be illustrated by looking at competing but functionally equivalent variants of a 
single variable, i.e. relatively basilectal, mesolectal, and acrolectal forms corresponding 
to each other. The first to do this, in a post-creole continuum framework, 
was DeCamp (1971), and by and large we find such stratifications referred to in 
some of the literature (cf. Schneider 1990; Winford 1997). Schneider (1998) worked 
this out systematically for negative patterns as spoken in various Caribbean vari- 
eties. The cline of forms available and sociostylistically stratified (with individual 
positions on the cline varying by region) ranges from preverbal no/na (e.g. shiy 
no waan de byebiy ‘she did not want the baby’, Panama) via en/eh (the girl eh know 
‘the girl didn’t know’, Trinidad), neva for single events (a neva evn sty it ‘I didn’t 
even see it’, Belize), doun/don (ai don nuo, Barbados) to upper-mesolectal didn (it 
didn useta be that way ‘it didn’t use to be that way’, Gullah) and dialectal English 
multiple negation (non a di pikni-dem neba si notn ‘none of the children didn’t see 
nothing’, Jamaica). 


6 Conclusion 


No doubt the Caribbean is a fascinating region in many respects, not only cul- 
turally but also socially and linguistically, characterized by a multidimensional 
complexity of linguistic choices, symbolic representations, and negotiations that 
are actually difficult to fully understand and document. While much of the early 
effort in the field went primarily into investigating the dominant creole varieties as 
spoken, say, in Jamaica, Trinidad, or Guyana, the situation has now improved in 
that many other localities and situations have become objects of linguistic investigations, 
though there is still much need for groundwork to be done. Theoretically 
speaking, there has been a tendency in recent decades for scholarship to step back 
from earlier strong claims which saw creolization as a unique and highly excep- 
tional process, steered by whatever causes different scholars considered decisive, 
and to embed the study of creoles and creolization more broadly in the discipline 
and the parameters of language contact studies. In terms of types and outcomes 
of language contact, almost the entire range of contact conditions and phenomena 
as outlined by Thomason and Kaufman (1988) and Thomason (2001) can be found 
and investigated there. Clearly, for scholars interested in language contact this is 
a prime region to turn to and investigate more closely. 


NOTE 


1 JamC = Jamaican Creole; GuyC = Guyanese Creole; TriC = Trinidadian Creole; TobC 
= Tobagonian Creole. 
24 Contact and Asian 
Varieties of English 


UMBERTO ANSALDO 


1 Introduction 


There is little doubt that English has a very prominent position in Asian societies 
today. In Japan, Korea, China, and Taiwan, it is widely studied as a second or 
foreign language, and in Hong Kong, a subset of the population can actually claim 
a certain degree of Cantonese–English bilingualism. In Mainland Southeast Asia 
(Vietnam, Laos, Cambodia, and Thailand), there is a growing economy centered 
on the teaching of English as a second language (the “Expanding Circle” according 
to Kachru 1985). In peninsular Southeast Asia, especially Malaysia and Singapore, 
the role of English has been increasing to that of a strong second language and 
the language used by the government for international communication (these 
varieties would roughly fall within Kachru’s “Outer Circle”). And in Singapore, 
English is also one of the four official languages, as well as a first language (L1) 
of a significant part of the population (see section 3.3). English foreign language 
(L2) varieties also abound in South Asia (but are not touched upon in this chapter; 
see Hickey 2004a for an overview). 

Considering the range of functions covered in such different sociolinguistic 
contexts, it is difficult to talk about “Asian Englishes” in general terms, because 
there is the risk of lumping together what are in fact different social phenomena 
and different structural types (but see Kachru 1985 for some common historical 
and sociolinguistic traits of the region). Likewise, we should not be too quick in 
identifying common, or even universal properties of English varieties in different 
Asian societies, until we have a proper understanding of how each local context 
influences the typological and sociolinguistic profiles of the varieties in question 
(Ansaldo 2004; Lim 2007; 2009a; 2009b; Gisborne & Lim 2009). In this chapter 
I elaborate on these points by focusing on some properties of one Asian English 
variety, namely Singlish, which illustrate the role of language contact in the specific 
ecology in which it functions. These properties highlight the influence of “substrate” 
grammars in contact language formation, rather than focusing on “Angloversals” 
or other generic Asian English features (see Szmrecsanyi & Kortmann 2008). 
The choice of Singlish, the most basilectal register of English spoken in 
Singapore by a majority of Singaporeans as first or second language, may be at 
first glance somewhat puzzling: Is it indeed the most illustrative example of an 
Asian English variety or is it actually the odd one out? The answer is both, depend- 
ing on our definition. Singlish is the only restructured Asian English vernacular 
to be thoroughly described to date, and it is the only variety of English acquired 
as primary language by its speakers. Other varieties, such as Malaysian English, 
have been so far mostly observed in terms of reductionist accounts (i.e. what is 
(not) there in terms of English grammar, see Baskaran 2008) and thus appear to 
reveal only a mild degree of restructuring (Platt & Weber 1980; Schneider 2007). 
While Hong Kong English may well display distinctive phonological features (see 
Bolton 2003 for an overview), there is very limited evidence to suggest that we 
are looking at a distinct Asian variety in grammatical terms (Schneider 2007: 138; 
but see Gisborne 2009). A clearly restructured variety can be found in Chinese 
Pidgin English, where abundant Cantonese grammar mixed with predominantly 
English lexical sources, but here we would be looking at a now extinct pidgin, a 
code once used in a restricted context, rather than a New Asian variety (Matthews 
this volume; Ansaldo, Matthews, & Smith, forthcoming). In a sense then, Singlish 
is not merely an L2 variety and is not representative of the use of English in most 
Asian societies. From the point of view of language contact, however, Singlish is 
the best possible case study, as it is the one variety of English that clearly recom- 
bines grammatical elements of other languages, i.e. Sinitic and Malay, into a novel 
grammatical profile (Ansaldo 2004). It is in this sense that, if we want to look at 
English in contact with Asian typologies, there is no better place to look than 
Singlish. 

This chapter is structured as follows. Section 2 sketches a theoretical setting 
within which English contact varieties can be studied. Section 3 reminds us of 
the ecology within which English was imported to Southeast Asia, in order to 
properly appreciate the types of contact environments that were formed prior 
to the development of Singlish. Section 4 focuses on some typical grammatical 
features of English that reveal the role of substrate typology in contact language 
formation. Section 5 extends these observations to other cases of English in contact 
in the Asian region, and draws a number of conclusions. 


2 Theoretical Prelude 


What are Asian English varieties (AEVs)? They are the product of the presence 
of English in ecologies where other non-English, non-Standard Average European 
languages are also spoken. They can be cases of English L1 in contact with other 
languages, English L2 usage, or instances of English L2/L3 transmitted infor- 
mally within a linguistically diverse ecology (Mesthrie 2008). Basically AEVs 
are grammatical systems that arise through a cline of code-mixing that, if social 
circumstances are ripe, yields a novel grammatical system on which a speech 
community converges, rendering the system stable and focused (Le Page & 
Tabouret-Keller 1985; Ansaldo 2009). In this sense, studying AEVs falls under the 
study of language change, in particular contact-induced change, where contact 
between structurally very different languages takes place (Hickey 2004a; 2004b). 
As such, AEVs can be regarded as restructured vernaculars (Holm 2004) and, 
in order to be understood, require a theoretical approach to contact language 
formation. 


2.1 AEVs as cases of contact language formation 


Contact language formation — understood here as the various processes that lead 
to the evolution of mixed languages, Creoles, and pidgins — is, in general terms, 
a case of language change, or shift, which leads to structural differences between 
the input and the output grammars. Simply put, we can understand these types 
of changes in at least three ways: 


1 Change as departure from a norm, i.e. the result of emerging difference from 
this norm. This typically requires “negative” explanations in the sense of imper- 
fect acquisition or broken transmission. 

2 Change as system-internal: this usually implies that speakers are passive, and 
that changes happen to the grammar because of structural imbalances, system 
realignments, etc. In a radical interpretation of this, context does not matter 
in order to understand why change happens. 

3 Change as evolutionary, following other complex adaptive systems (Lass 1997). 
The reason for change is ecological variation and the mechanisms of change 
are selection and replication. 


In light of the fact that languages normally change over time, and that there is 
variation within speech communities of all types, interpreting changes to a gram- 
mar as indications of abnormal acquisition or transmission (i.e. sense (1) above) 
appears to go against common sense. In fact, even in cases of radical restructurings 
such as those observed in Creole languages, it has been pointed out that explana- 
tions that rely on exceptional circumstances are usually ideologically biased rather 
than empirically grounded (DeGraff 2001; 2003; 2005). Moreover, it has become 
clear that children alone are not enough for new grammatical systems to emerge, 
as their imperfect replications are constantly checked and corrected by adults, and 
it is only the latter who can really diffuse innovative structures. In relation to (2), 
historical linguists have time and again pointed out that it is speakers who 
change languages, and that explanations for language change lie ultimately in the 
social history of a speech community, not in its grammar (Thomason & Kaufman 
1988; Janda & Joseph 2003). While system-internal dynamics may influence the 
direction of change, these dynamics ultimately result from circumstances that are 
external to the grammar and originate in the history of the speakers. We can thus 
say that there is good reason to be very critical of (1) and very cautious about (2); 
we thus turn our attention to (3), which in essence is the role of context in contact 
language formation. 

In the approach presented in the remainder of this section, the role of context 
is modeled in evolutionary terms for two reasons: ecological interaction is at the 
core of an evolutionary model of language change, as argued in Croft (2000) 
and Mufwene (2001; 2008); and an ecological approach allows for both internal 
ecology (strictly speaking linguistic) and external ecology (sociohistorical factors) 
to be studied within the same framework (Mufwene 2001), and thus offers the 
advantage of a uniformitarian model for the study of contact language formation 
(Mufwene 2001; 2008; Ansaldo 2009). 


2.2 Contact language formation in evolutionary terms 


It is important to appreciate that evolutionary theory may help us explain language 
change, not because languages are seen as biological systems, nor because language 
is seen as a genetic feature in the Chomskyan sense. Rather, the basic assumption 
underlying evolutionary approaches is that languages can be viewed as complex, 
adaptive systems in the sense of Dawkins (1976), Hull (1988), and Lass (1997). 
Following these pioneering works, an idea of language emerges which sees it, like 
other cultural systems, as evolving over time and as undergoing modification as the 
result of (1) environmental adaptation and (2) mutation. Note that the link between 
evolutionary biology and language change is an old one: the parallels go back to 
the use of the Stammbaum model to describe language speciation (see Mufwene 
2008), and are explicitly drawn out in Lass’ (1997) idea of language as an evolv- 
ing system. Within grammaticalization theory we note a strand that looks at the 
evolution of grammar as an instance of ritualization (Haiman 1994), a possibility 
already implied for example in Givón’s (1979a) inherently diachronic approach 
to grammar. But perhaps the most significant development in terms of evolutionary 
theory can be seen in the work of Hull (1988), who extends the basic assumptions 
of evolutionary theory to account for the development of all conceptual systems, 
in particular the development of scientific thought. 

Against this theoretical background, Croft (2000) proposes a model of language 
as a complex adaptive system, and Mufwene (2001; 2008) approaches the evolu- 
tion of new languages as products of differential ecologies. What these studies 
have in common is the recognition that languages exist in a context that is 
inherently variable, i.e. linguistically diverse; variability means that speakers 
have a certain degree of choice as to which variables, whether phonological, 
lexical, or syntactic, to use in a given instance. Croft (2000) recognized three types 
of variables: 


1 first-order variants: individual variants (phonological, semantic, syntactic 
etc.) that represent the natural individual differences in the same language; 

2 second-order variants: first-order variation that acquires sociological signifi- 
cance in a speech community: indications of social class, gender, profession, 
etc.; 

3 third-order variants: second-order variation conventionalized in a speech 
community: divergence (dialects or languages). 

In the multilingual ecology which is typical of contact language formation, we 
can say that features of different varieties can be seen as being in competition 
with one another; it is the cumulative result of different choices, or different out- 
comes of competition, that explains why languages change and why speakers vary 
in their usage. Let us take a closer look at this. 

As argued most recently in Ansaldo (2009), most societies around the globe have 
been and still are multilingual. Note also that schooling and regulated language 
transmission are features of modern societies. This makes monolingual, norma- 
tive ecologies such as those that have existed in Europe for the past two centuries 
extremely marked from the point of view of human history. The type of language 
acquisition that happens in these ecologies is not “normal,” because for thousands 
of years this is not how languages were transmitted. If we want to understand 
how contact vernaculars such as AEVs develop, we must bear in mind that they 
evolve in multilingual ecologies in which some variety of English represents only 
one set of features available to speakers. In the same ecology other grammars are 
present, be they Chinese, Malay, Filipino, or Hindi. Grammatical features of these 
languages also play a role in the selection and replication processes, in line with 
Weinreich’s (1953) observation that in contact situations superficial multilingual- 
ism is enough for language interference. Therefore, we should not assume that 
English was the one and only target for nonnative speakers in the evolution of 
AEVs; rather, the speakers who contributed to the development of AEVs were 
busy selecting and replicating linguistic features from a pool within which 
English grammar constituted but a subset of choices available. This is not meant 
to suggest that speakers are entirely free from the constraints that their dominant 
languages impose on the acquisition of new ones; nonetheless, what I want to 
suggest is that speakers in multilingual ecologies do have a certain degree of agency 
in terms of communicative practices. It is also important to realize that normal 
transmission is untutored, creative, and involves more than one language in most 
colonial settings where AEVs emerge. This is clear when we look at the evolution 
of Singlish in Singapore. 


3 The Contact Ecology of Southeast Asia 


Southeast Asia, in particular the region that encompasses the Malay peninsula and 
the numerous islands that constitute the Indonesian archipelago, has been the hub 
of intense commercial activity since at least the tenth century CE. Already at that 
time, a maritime trading network based in Sumatra and known as “Srivijaya” 
profited from its central position in the spice trade between the Southern Indian 
Chola Empire and the Chinese Empire (Reid 1993). In these commercial exchanges, 
dictated by the regular and predictable patterns of the Monsoon winds, people 
from Arab, Tamil, Malay-Indonesian, and Chinese cultures came into contact and 
exchanged, besides goods, cultural and linguistic elements. Between the eleventh 
century and the fourteenth century a number of powerful cities arose in the Malay- 
Indonesian region in which merchants of different ethnic origin lived together for 
the purpose of trade. These so-called “port-polities” (see Findlay & O’Rourke 2007) 
can be characterized as pluralistic societies with religious and cultural pluralism 
and pragmatic governance focused on the promotion of trade. 

The port-polity par excellence was undoubtedly Melaka (Arabic malakat, ‘market’), 
which can be seen as “the successor of Srivijaya and a predecessor of Singapore, 
as the major port-polity of the Southeast Asian region” (Findlay & O’Rourke 2007: 
134). Melaka was a pluralist system in which different religious, political, and finan- 
cial systems merged to accommodate a multicultural and multilingual population 
of traders. Pires (1515) reports that merchants (men and women) from over 50 
different countries were represented, at least 84 distinct languages were spoken 
in the city, and up to 500 different money-changers could be found in 
just one street. Cultural diversity, including the linguistic pluralism and language 
contact that one naturally assumes in such an environment, was a long and estab- 
lished practice of this part of Southeast Asia (see Ansaldo 2009 for details) before 
the arrival of Western traders. Linguistically, we can see the results of the intense 
contacts between different cultures in the many trade languages, especially 
Malay-based, that developed during this time (Adelaar 2005; Ansaldo 2009). To 
this day many different forms of Bazaar Malay, in which Malay and Chinese 
elements mix, are spoken in harbors and major cities of the region, and many 
contact languages that we know of show traces of Pidgin Malay influences 
(Adelaar & Prentice 1996). 


3.1 Western visitors 


The Portuguese were the first Western arrivals to Asia, having established their 
hold on the Estado da India. They conquered Melaka (1511) and introduced into 
the region Portuguese vernaculars, which would in some cases be adopted by later 
Western arrivals as lingua francas, as well as African slaves. They were replaced 
by the Dutch East India Company (1641), which was largely responsible for an 
increase in the slave trade in the region, with subsequent population movements, 
forced migrations, and interethnic admixture (Ansaldo 2009). The British entered 
the Malay-Indonesian region as Dutch power started to decline, burdened as it 
was by military and administrative costs, conflicts with local potentates, and a 
general saturation of the spice market in Western Europe. The British had 
already gained control of much of the Indian subcontinent and had also started 
a lucrative country trade with China. In 1786 the British East India Company (EIC) 
obtained the lease of Penang Island in Malaysia; in 1819 the EIC acquired the island 
of Singapore. Together with Melaka, ceded to the British in 1824 as part of the 
Anglo-Dutch Treaty, these three locations formed the Straits Settlements, from 
which the British could compete against the Dutch. 

Thus English entered the ecology of Southeast Asia in at least two forms: as 
the varieties spoken by the traders, soldiers, and sailors, the descendants of those 
plunderers and pillagers that constituted the bulk of the early colonizers com- 
prising not only the British but all Western arrivals (Andrews 1984; Ansaldo 2009); 
and as a language of education (in part through mission schools), in order to 
cultivate an English-speaking elite among the natives (Lim 2004: 3). This means 
that both standard and nonstandard varieties, cultivated and noncultivated regis- 
ters, grammatically correct and grammatically deviant patterns of British English, 
as well as the L2 English of non-Anglophone Western traders entered the linguistic 
ecology of the region. Here these varieties encountered different Asian languages, 
of which two stand out for political and demographic reasons: Malay-based 
trade languages and Sinitic (especially Min). 


3.2 The linguistic ecology of English in Southeast Asia 


In reconstructing the linguistic ecology within which English was introduced in 
nineteenth-century Southeast Asia, we have to assume — besides a widespread 
multilingualism due to prolonged practices of cultural admixture introduced in 
section 3.1 above — specific interaction with two types of languages in particular: 
(1) vernacular and contact Malay varieties and (2) Southern Chinese. It is obvi- 
ous from the above that other languages, such as Tamil and Arabic, were present 
in colonial Southeast Asia as languages of trade and interethnic communication 
and played a role in different types of environments. But for the purpose of the 
history of English in peninsular Southeast Asia, the big players were varieties 
of the two powers that had been in intimate contact since at least the fourteenth 
century: Malay and Chinese (Ansaldo 2009). 

In terms of Malay, it was contact varieties such as Baba Malay, which evolved 
within Sino-Javanese (Peranakan) communities during precolonial times, that were 
widespread languages of interethnic communication at the time. The Peranakans, 
descendants of intermarriage between mainly Chinese Hokkien males and Malay/ 
Indonesian women, had been prestigious traders for many centuries before the 
arrival of Western traders, and as multicultural and multilingual elites were used 
as middlemen between the British and the locals for the purpose of trade and 
politics (Ansaldo, Lim, & Mufwene 2007). Peranakan communities were indeed 
the dominant cultural groups of all three Straits Settlements. Varieties of Bazaar 
Malay, a contact language closely related to Baba Malay, were and are still spoken 
today throughout the region (Khin Khin Aye 2005; Ansaldo 2009). It is varieties 
like these, with a structural profile rather distinct from standard and/or literary 
Malay, that came into contact with English. 

On the Chinese side, the predominantly southern origin of the traders is histor- 
ically abundantly documented; one focal point of emigration to Southeast Asia 
was southern Fujian, the region of China facing Taiwan. People from this area are 
speakers of southern Min, also known as Hokkien in Southeast Asia. Other Sinitic 
varieties present in the ecology were Cantonese and, to some extent, Mandarin 
(see Ansaldo 2009). In several grammatical domains these Sinitic languages share 
sufficient traits for generalization from one variety to the other to be applicable, 
though this is not always the case. In considering the possible influence of language 
contact in the evolution of a language like Singlish, it is first and foremost to 
varieties of contact Malay and Southern Sinitic that we need to pay attention, as 
will become clear in the rest of this chapter. 


3.3 Singapore 


As a strategic location for the China-Southeast Asia trade, Singapore already had 
a heterogeneous population before 1820, which consisted of over 5,000 inhabitants. 
These included Malaysian Malays, Javanese, Balinese, and Buginese from 
Indonesia; Indian Tamils, Malayalees, Punjabis, and Sikhs; as well as Hokkiens, 
Teochews, Cantonese, Hakkas, and Hainanese from China; mixed groups such 
as the Peranakans and the Eurasians were also represented (Lim 2004; 2007; 2009b). 
While initially the largest proportion of the population were “Malays” (from 
various parts of Malaysia and Indonesia) followed by the Chinese traders, 
over time the Chinese communities grew rapidly in number to comprise half of 
Singapore’s population by the beginning of the twentieth century. Throughout 
this entire period, i.e. the nineteenth century until around the 1970s, a form of 
Bazaar Malay was used as the primary interethnic lingua franca, with Hokkien the 
main intra-ethnic lingua franca for the Chinese (Lim 2007). Over time, English 
grew as a medium of instruction, gradually replacing instruction in other languages 
(in particular Malay and Mandarin) and established itself as a new language of 
interethnic communication. This led to the evolution of another contact phenomenon 
in Singapore: Singapore colloquial English (Platt & Weber 1980; Gupta 1994; Lim 
2004), also known as Singlish, a variety in which Sinitic, Malay, and English gram- 
mars blend to form what is de facto one of the two languages most Singaporeans 
today grow up speaking (the other being Mandarin; Ansaldo 2004; see also Bao 
2001; 2005; Bao & Lye 2005). 

For the purposes of this section, it is noteworthy that Singapore’s ecology, though 
still very multilingual, has changed over time from being highly multilingual at 
the societal and individual levels, with a strong dominance of Malay and Sinitic 
features, to largely English (in various registers) and Sinitic (Mandarin, Hokkien, 
and Cantonese are all represented), with a decrease in the usage of Bazaar Malay. 
This can be seen as a general trend in the postcolonial history of the region, as a 
consequence of the introduction of standardized and global languages (see Lim 
2007 and 2009b for a detailed account of Singapore’s linguistic ecology). However, 
as documented in Khin Khin Aye (2005: 24), Singapore Bazaar Malay is still alive, 
mostly used by elderly Singaporeans of Malay, Baba, Indian, or other non-Malay 
origins, when other common languages are not available, while in the younger 
generations of Singaporeans, Singlish functions as a language of interethnic 
communication. 


4 Grammatical Features of Singlish 


In what follows I approach the study of Singlish grammar from a typological 
perspective, in line with the assumption that it is the degree of typological 
congruence between the grammars in contact that has the strongest influence on 
the outcome of contact language formation (this is formulated as the “Typological 
Matrix” approach in Ansaldo 2009; see also Mufwene 2001; Aboh & Ansaldo 2007). 


4.1 Agglutinative versus isolating typology 


From a typological perspective, the best way to approach Singlish grammar is to 
recognize a fundamental difference between, on one side, English and, on the 
other side, Sinitic and Malay. While English is classified as a mildly agglutinative 
language, meaning that it makes a certain use of morphology for grammatical 
purpose, Sinitic and Malay, the dominant languages of Singapore’s linguistic 
ecology, are typically isolating, meaning that there are hardly any morphological 
processes to speak of. This is true of many creole languages of the Caribbean too 
as well as other contact languages, a fact that has led a number of scholars to argue 
that morphology is “lost” in the histories of these languages. However, as already 
pointed out by Givón (1979b), there could be another explanation: that the 
languages that evolve out of specific contact situations inherit the morphology of 
the so-called substrate languages (see also Ansaldo 2008). In the case of Atlantic 
creoles, these would be predominantly some West African languages of the 
isolating type. In the case of Singapore, as we will see, it is indeed in the nature 
of the non-English languages that much grammar finds explanation. In order to 
illustrate this, I focus on the following features: (1) zero-copula, (2) predicative 
adjectives (or property verbs), (3) topic prominence, (4) aspectual categories, 
(5) reduplication, and (6) utterance particles. These are not the only properties of 
Singlish grammar but they constitute typical properties derived from the inter- 
action between Sinitic, Malay, and English in the contact ecology. I therefore do 
not mean to claim that all Singlish grammar is derived from non-English features; 
what I aim to show is that a substantial amount of Singlish grammar is not English. 
In an evolutionary approach this should be our default assumption; and as a con- 
sequence, adstrate or substrate research should be of primary concern in the study 
of AEVs, and universal or Angloversal explanations should be based on careful 
scrutiny of the typologies in contact (Sharma 2009). 


4.2 Zero, verbless structures, and topic prominence 


Consider the following sentences: 


(1) a. She good a21? ‘Is she good?’ 
b. Careful, laksa very hot. ‘Be careful, the laksa is very hot.’ 


A normative English perspective would suggest that the copula is missing. This 
occurs very frequently in basilectal Singlish - why is that? One could think that 
Singlish is incorrect vis-à-vis English. However, from a typological point of view, 
zero-copula is a common feature of many languages of Asia and beyond. In such 
languages, it is typically related to another feature commonly found in isolating 
languages, namely the absence of a clear distinction between the word classes we 
know as “Verbs” and “Adjectives.” In Sinitic and Malay alike, we find indeed that 
copula verbs are rare and usually used for emphasis (Goddard 2005), and that 
“adjectives” behave predicatively, and are usually referred to as predicative 
adjectives (Li & Thompson 1976) or property verbs (Wetzer 1996): 

[Figure 24.1: Zero-copula for predicative nominals and adjectives] 


(2) a. Keoi5 hou2 leng3 (Cantonese) 
b. Dia banyak cantik (Malay) 
S/he very pretty 
‘She is very pretty’ 


This means that predication is expressed without the need of a copula, so sentences 
like “He good” are grammatical, because good is actually interpreted as a predicate. 
Zero-copula and predicative adjectives are common typological correlates cross- 
linguistically, as illustrated in Figure 24.1. 

In fact, they relate to a third property of Singlish, which is topic prominence 
(Ansaldo 2004; Bao & Min 2005): 


(3) a. Today weather very hot wat. ‘Today’s weather is very hot, as you know.’ 
b. NUS, NTU everything you read must say global. ‘Regarding NUS and 
NTU, whatever you read about them has the word “global.” ’ 


There is good reason to believe that languages can be classified as either topic- 
prominent or subject-prominent (Goddard 2005), the latter often being the prevalent 
option in languages of Europe. Topic prominence on the other hand is typical of 
isolating languages since they have little morphology to carry out agreement tasks, 
often used to identify subjecthood. In English, subject and topic often align, but 
in topic-prominent languages, other grammatical elements can be topics, as seen 
in the following examples: 


(4) a. Lionel met (him) already ‘Lionel I already met’ 
b. Expensive the Durian here ‘Durians are expensive here’ 
c. That book got already ‘I already have that book’ 


Topic prominence is a strong feature of all Sinitic languages, as shown in (5a-c) 
for Mandarin, as well as Malay. 


(5) a. Zhang1san1 yi3jing1 jian4-guo le 
Zhangsan already see-EXP CRS 
‘Zhangsan I have already seen’ 
b. Shang4 ge yue4 tian1qi4 fei1chang2 men1 
last CL month weather extremely humid 
‘Last month the weather was extremely humid’ 
c. Xiang4 bi2zi chang2 
elephant nose long 
‘Elephants have long noses’ 


The three features discussed so far, especially zero-copula and topic prominence, 
are clearly instances of replication of Sinitic (and Malay) grammar. In an evolu- 
tionary approach, this means that speakers select these features instead of, say, 
agreement marking and copular verbs. This does not really mean that speakers 
consciously choose them; in this case, the selection is most likely motivated by 
the typological pressure that the structural congruence between Malay and 
Sinitic exercises in the contact environment, as illustrated in Table 24.1. 

As we can see from Table 24.1, Sinitic (Hokkien) and Malay (Bazaar Malay) 
adopt the same strategies in all three instances, while English diverges. In the 
linguistic ecology of Singapore, the Sinitic-Malay features win in the com- 
petition process because they have higher type-frequency in the multilingual 
context in which speakers of Singlish communicate (Ansaldo 2009). Note how- 
ever that, as is often the case in the study of contact language formation, Sinitic 
emerges as a very likely, but not exclusive source of grammar in Singlish. As 
shown in Hickey (2007), topic prominence is a common feature of Irish English, 
and zero copula is found in the south-east of Ireland. This suggests that there is 
more than one route in the evolution of new grammatical material. In the case 
of copula, it is often observed that, in contact-induced language change, zero- 
copula is the preferred choice, probably due to the fact that copula is a some- 
what redundant feature of language, as illustrated by the examples in (1) and 
(2) above. 


Table 24.1 Distribution of features in the ecology 


Sinitic             Malay               English 
Zero-copula         Zero-copula         Copular verb 
Property verbs      Property verbs      Verb/adjective distinction 
Topic prominence    Topic prominence    Subject prominence 
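
The competition summarized in Table 24.1 can be made concrete with a simple tally. The Python sketch below is illustrative only: it assumes, for simplicity, that each language in the ecology counts as one equally weighted source of features, and it takes type-frequency to mean the number of languages sharing a variant. The names ECOLOGY, type_frequency, and most_frequent_variant are hypothetical labels introduced here for illustration; they are not part of the analysis in Ansaldo (2009).

```python
from collections import Counter

# Feature values per language, transcribed from Table 24.1.
# Assumption: each language counts once, regardless of speaker numbers.
ECOLOGY = {
    "Sinitic": {
        "copula": "zero-copula",
        "adjectives": "property verbs",
        "prominence": "topic prominence",
    },
    "Malay": {
        "copula": "zero-copula",
        "adjectives": "property verbs",
        "prominence": "topic prominence",
    },
    "English": {
        "copula": "copular verb",
        "adjectives": "verb/adjective distinction",
        "prominence": "subject prominence",
    },
}


def type_frequency(ecology, feature):
    """Count how many languages in the ecology use each variant of a feature."""
    return Counter(values[feature] for values in ecology.values())


def most_frequent_variant(ecology, feature):
    """Return the variant shared by the largest number of languages."""
    variant, _count = type_frequency(ecology, feature).most_common(1)[0]
    return variant


for feature in ("copula", "adjectives", "prominence"):
    counts = dict(type_frequency(ECOLOGY, feature))
    print(feature, counts, "->", most_frequent_variant(ECOLOGY, feature))
```

On this deliberately crude count, the variant shared by Sinitic and Malay outnumbers the English variant for each of the three features. A more realistic model would presumably weight each variety by its number of speakers and its frequency of use, information that the table does not record.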


4.3 Aspect not tense 


A similar situation occurs in the marking of tense and aspect. This is also related 
to the isolating typology of the languages in question; note that the category of tense 
is most often marked by affixation to a verb stem. Obviously, where languages 
are isolating, and do not use morphology for the expression of grammatical 
categories, other means must be employed. Languages like Chinese and Malay 
do not really mark tense, apart from indicating relative tense with time phrases 
such as ‘yesterday’, ‘last night’, etc. On the other hand, they mark aspect, i.e. the 
relative temporal structure of an event, such as continuous in English. One of the 
most basic aspectual distinctions in languages of the world is the one between 
perfective (i.e. finished, completed, closed) and imperfective (i.e. ongoing, 
repetitive). Along these lines, we find that Singlish uses time phrases to indicate tense 
and regularly marks aspect. The phrase “last time” has clearly grammaticalized 
to indicate past tense: Lastime got mango you know ‘We used to have mangoes here’. 
More importantly, Singlish often expresses aspect rather than tense. A number of 
different aspectual markers can be found, among which the most grammaticalized 
ones are: perfective (6a), durative (6b), and habitual (6c) (see Bao 2001; 2005): 


(6) a. Oh, they go already ah? 
‘Oh, they have already left?’ 
b. They still give my hoping lah. 
‘They still give me hope.’ 
c. Always seated at the cashier old lady you know. 
‘You know, the old lady (who is) always seated at the cashier.’ 


Note that aspectual systems are robust features of Sinitic and Malay alike 
(Goddard 2005). In both languages we find a basic distinction along the lines of 
perfective/imperfective; within the latter, usually a number of different states can 
be distinguished, including progressive, iterative, durative, and habitual (Goddard 
2005). Sinitic languages such as Mandarin, Cantonese, and Hokkien all realize these 
aspectual categories in comparable ways (Li & Thompson 1976; Bodman 1955; 
Matthews & Yip 1994; Goddard 2005); colloquial varieties of Malay such as Bazaar 
Malay likewise give prominence to Aspect rather than Tense (Khin Khin Aye 2005). 

The situation sketched here is comparable to the one described in Table 24.1; 
the typological congruence between Sinitic and Malay wins over English struc- 
tural features; this is the reason why we do not find much Tense morphology in 
Singlish and why aspectual markers have developed. What should be clear is that 
the isolating typology of Singlish, i.e. the lack of morphological material, is not 
a loss due to problematic transmission of English. As we have seen, Singlish has 
other grammatical categories in place instead of the ones usually expressed by 
nominal and verbal morphology in English. Rather, the lack of morphology, as 
well as the innovative grammatical features of English, are the result of typolo- 
gical congruence between two isolating grammars, namely Sinitic and Malay, 
which are and were present in the formative stages of Singlish grammar. If the 
suggestions in this chapter are correct, then we can say that speakers are sensi- 
tive to the dominant grammatical patterns and tend to choose congruent, type- 
frequent features in constructing their grammar. 


4.4 Reduplication 


Another area of interest is reduplication which in Singlish includes different 
patterns with different functions. In Ansaldo (2004) and Wee (2004), at least four 
patterns are identified, illustrated below: 


1 N-N for intimacy: this my girl-girl = ‘this is my little girl’ (affectionate, not very 
productive) 

2 V-V for attenuation: just eat-eat lah = ‘eat a little’ (or pick some) 

3 Pred.Adj-Pred.Adj for “intensification”: his face red-red = ‘really (quite) red’ 

4 V-V-V for durative: we all eat-eat-eat = ‘keep eating/eat a lot’ 


Reduplication is a rather productive morphological process in Singlish, unlike 
English which makes very little use of it. Note that both Malay varieties and Sinitic 
varieties show reduplication patterns; however there is great variation in re- 
duplication strategies within Sinitic and Malay and, when looking for possible 
sources, one should be careful in making generic parallels with Mandarin or 
Standard Malay as these were not the most prominent varieties and may not have 
directly influenced the grammar of Singlish. In Sinitic, both Yue and Min varieties 
allow for adjectival and verbal reduplication of most word classes (cf. Matthews 
& Yip 1994: 44-8; Tsao 2001; Ansaldo NIR). In southern Min, for example, we 
find structures of the following type: 


(7) a. saan kin sue-sue chieng khi 
clothes quick wash-wash clean 
‘wash the clothes very quickly’ 
b. in khin khin khiam khitim teh kue jit ci 
3PL thriftily-thriftily DUR pass day 
‘they lived very thriftily’ 


Contact Malay varieties such as Cocos Malay show predicative adjectival (focus- 
ing) and verbal reduplication (casualness) (Ansaldo 2009): 


(8) a. ikan ini besar-besar 
fish(es) DEM big-big 
‘this fish/these fishes are rather big’ 
b. saja jumput untok minum minum 
I invite for drink-drink 
‘I invite you for just a drink’ 


Finally, as indicated in point (4) above, we also find instances of triplication in 
Singlish (Ansaldo 2004; Wee 2004) which express continuity. Triplication, i.e. 
reduplication patterns where the base is copied twice, is in fact a typologically 
rather rare phenomenon (cf. Blust 1999). Interestingly, southern Min varieties do 
allow triplication (Tsao 2001: 305; Ansaldo 2004; Wee 2004): 


(9) a. peh peh peh 
white-white-white 
‘extremely white’ 
b. i kin-a-jit ching kah sui-sui-sui 
3SG today dress COMP pretty-pretty-pretty 
‘she is dressed up very beautifully today’ 


Though the structural parallel is weak, because in Singlish triplication is verbal, 
while in Min it is obviously restricted to predicative adjectives, we must recall 
that Hokkien, a southern Min variety, was the most dominant Sinitic substrate 
in the early ecology of Singlish (cf. Lim 2007). This means that the possibility of 
an early substrate influence cannot be discounted. As argued in Ansaldo (2004), 
triplication does not seem to occur in other restructured English varieties. If inter- 
nally motivated change were the only explanation for triplication in Singlish, 
we would expect this to have occurred in at least some other case. The fact that 
triplication exists in the most significant adstrate of Singlish, however, explains 
why this feature would appear in Singlish and apparently no other English-based 
varieties. 

When we compare a number of significant languages in the multilingual 
ecology of Singlish, the picture in Table 24.2 emerges. As this table shows, there 
is a close match between the patterns of reduplications and their functions in Singlish 
and Hokkien which seems to account for the evolution of these patterns in 
Singlish. Moreover, we need to recognize the relevance of Hokkien in the 
emergence of Singlish reduplication as opposed to a more vague “Chinese.” 
In particular we can see the importance of correctly identifying the variety of 


Table 24.2 Reduplication patterns in the Singapore ecology (adapted from
Ansaldo 2004: 134)

Type/Function          Singlish         Bazaar Malay      Min (Hokkien)    Cantonese

Nominal                Intimacy         Plurality         Intimacy         Intimacy
Verbal                 Attenuation      Attenuation/      Attenuation      Attenuation/
                                        continuation                       continuation
Predicative adjective  Intensification  Intensification*  Intensification  Intensification
Verbal triplication    Continuity       NA                Continuity       NA

*Note that in Colloquial Malay varieties the attenuating function is also found.
substrate in our case as it provides us with additional, illuminating information.
The transfer of a whole (sub)set of grammatical features from one language to 
another has been described in particular in work by Matras (2000) under the label 
of “categorial fusion.” In short, categorial fusion suggests that tightly organized 
systems (or paradigms), in particular those with high discourse prominence, can 
and will be transferred from one language to the other in toto in contact situ- 
ations. In the case of Singlish, a striking instance of this is found in the domain of 
utterance particles, the majority of which are borrowed, including tonal features, 
from another adstrate, namely Cantonese. Though these are given a most comprehensive
treatment in Lim (2007), I briefly sketch them in the next section.


4.5 Utterance particles 


The study of utterance particles in Singlish nicely illustrates how AEVs evolve in 
different ecologies and through different adstrate influences. Table 24.3 summar- 
izes the treatment of utterance particles in Singlish presented in Lim (2007; 2009a). 
As illustrated, the evolution of particles in Singapore English clearly follows from 
the dominance of different languages in the different periods, first Malay and 
Hokkien, then Cantonese. Note that tonal features are also transferred into 
Singlish; it is also important to remark on the fact that the Sinitic language with 
the richest system of utterance particles, namely Cantonese, has the most substantial 
influence in Singlish. The approach presented in Lim (2007; 2009a) underlines the 
fact that AEVs, as contact languages, are continually evolving as an adaptation 
to the changing environments in which they function. 


Table 24.3 Ecology and origins of particles in Singlish

Period        Ecology                                    Singlish particle       Origins

1800s-1950s   1. Malay, Hokkien as lingua francas        lah, ah                 Malay or Sinitic
              2. Southern Sinitic                        wat21*                  Unclear
              3. English

1956-1980s    1. Rise of Chinese/Mandarin in education
              2. English becomes lingua franca

1980s-2000    Increase in presence of Cantonese          lor33, hor24, leh55,    Cantonese
                                                         me55, ma33


* Successive periods layer over preceding ones. The two digits following a particle 
indicate its tone, represented as pitch-level numbers where 5 is a high pitch and 1 is 
a low pitch. 
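
The tone notation used in Table 24.3 can be made concrete with a small sketch. The Python below is an illustration only: the function names and the three-way contour labels (level, rising, falling) are assumptions introduced here for exposition, not categories taken from Lim (2007; 2009a). It simply splits a particle such as hor24 into its segmental form and its two pitch levels, on the 1 (low) to 5 (high) scale described in the table note.

```python
import re

# Pattern assumed for this sketch: lowercase letters followed by two pitch digits (1-5).
PARTICLE = re.compile(r"([a-z]+)([1-5])([1-5])")

def parse_particle(label: str) -> tuple[str, int, int]:
    """Split a label like 'hor24' into (form, start_pitch, end_pitch)."""
    match = PARTICLE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a particle with two pitch digits: {label!r}")
    form, start, end = match.groups()
    return form, int(start), int(end)

def contour(label: str) -> str:
    """Describe the pitch movement; the labels are illustrative, not from the source."""
    _, start, end = parse_particle(label)
    if start == end:
        return "level"
    return "rising" if end > start else "falling"

if __name__ == "__main__":
    for item in ["wat21", "lor33", "hor24", "leh55", "me55", "ma33"]:
        print(item, parse_particle(item), contour(item))
```

Particles listed without digits in the table (lah, ah) are deliberately rejected by this sketch, since the table gives no tone for them.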
5 English in Other Asian Contexts


The features described in detail in section 4 are by no means features exclusive 
to Singlish (see Hickey 2004b: 569), but represent some of the most typical fea- 
tures of Asian Englishes, as can be seen by their presence also in Malaysian English 
(see Platt & Weber 1980: 173-92; Kachru & Nelson 2006: 191-2). This is to be 
expected, considering that the linguistic ecology of Malaysia is to a large degree 
comparable to that of Singapore; in both cases Malay and Hokkien have played 
prominent roles as contact languages, albeit with different weight. Moreover, 
similar features are found in Asian-English contact varieties with similar typological
influences: the role of Sinitic (Cantonese) can be clearly seen in Chinese
Pidgin English, which displays topic-comment structure, zero copula, and attri- 
butive predication.’ The fact that reduplication is rare in Chinese Pidgin English 
(Ansaldo 2009) again corroborates the fundamental role of typology in contact 
language formation. Note that in Cantonese, the Sinitic language that contributes 
most grammar to Chinese Pidgin English, reduplication is not as common as in 
Hokkien, the dominant Sinitic language of Southeast Asia. Moreover, Malay, 
a language in which reduplication patterns abound, is not at all present in the 
ecology of Chinese Pidgin English, but plays of course a role in the evolution of 
English in Malaysia and Singapore. 

Not surprisingly the use of English in China and Hong Kong has been reported 
to illustrate similar characteristics. To be sure, in the case of Hong Kong English 
we may be looking more at practices of code-mixing than at actual varieties of 
contact languages, since it is doubtful that English in Hong Kong, for example, 
can be captured as a restructured vernacular with unique grammatical properties 
(cf. Bolton 2003). Nonetheless, the phonology of Hong Kong English at least clearly 
shows Cantonese influence, and final particles as well as tones are transferred from 
Cantonese to Hong Kong English (Luke & Richards 1982; Hung 2000; Fox, Luke, 
& Nancarrow 2008). 

It is true that in all AEVs we also find far more Standard English features than
the ones described above; this is to be expected in situations where lectal continua
characterize speakers of different social and educational backgrounds.
Thus in more standard registers of English in Singapore we will find features such 
as wh-movement, number and tense morphology. But when these features make 
their presence known in the basilectal, nativized variety, it is usually because 
of register variation, and the ability of speakers to accommodate to what they 
perceive as more Standard English speakers. 

Rather than presenting a list of features that might appear to be common in 
the usage of English in Asian contexts, I have chosen to focus on the typical 
features of the restructured, basilectal variety of English spoken in Singapore. 
I have done so because the typological and sociolinguistic parameters of English 
usage in Asia are far too diverse and too variable to be impressionistically 
described as related phenomena. What I hope this approach makes clear is that 
we should study each Asian English variety as a unique product of its ecology, 
and do so from a sociolinguistic and typologically informed perspective, in
which the grammars in contact are taken into account and reveal the reasons for 
similarities and differences among the different varieties (see Gisborne & Lim 2009). 
Before engaging in macro-comparative approaches of alleged universal features 
of New English varieties, it is advisable to know exactly what is being compared. 
Important questions to ask, for a methodologically sound framework, include: What 
is the social profile of the English being investigated? Is it English L2 usage or 
a basilectal variety with endonormative tendencies? What are the varieties 
influencing this usage besides different types of English? Do their typologies show 
congruence or conflict with English grammar, and how does this affect usage? 
Answering these questions has, to my mind, the potential to reveal which
features appear in the grammar and why. Furthermore, these questions promote
our understanding of contact-induced processes, including substrate influence, 
congruence, simplification, and restructuring. 


NOTES 


1 Mandarin is not the best source of comparison as it was until recently not dominant 
among Sinitic varieties in the ecology of Singapore. I allow myself to use it here only 
because of the fact that Topic-prominence does not, to the best of current knowledge, 
show much variation across Sinitic languages. This is different in the case of redupli- 
cation, as shown in section 4.4. 

2 Aspectual categories surface with remarkable similarities across different languages, 
and can also be found in non-standard varieties of English; in Hickey (2007) for example 
we find comparable aspectual markers to the ones described here for Singlish, namely 
progressive, perfective, and habitual. 

3 In many AEVs we also find the use of got or have for existential predication, as 
‘to have’ or ‘to exist’, clearly transferred from either Malay or Sinitic (and possibly con- 
gruence between the two in some ecologies, see Ansaldo 2004). 
25 Contact and African
Englishes 


RAJEND MESTHRIE 


1 Background to Anglo-African Contact 


Although the topic of African Englishes is a potentially vast one, several factors 
make it a manageable and coherent theme. One is that the main influence of English 
has been in sub-Saharan Africa, rather than the more northern countries in which 
French and Arabic predominate as cultural and contact languages. Formal British 
colonialism touched mainly West and Central Africa and almost all of East and 
Southern Africa in the establishment of colonies and protectorates. This chapter 
assesses the role of contact in the development of English in these places, in relation 
to other significant factors. 

According to Spencer (1971: 8) English was probably first taken to Africa in the 
1530s when William Hawkins the Elder passed there on his way to Brazil. This 
would have been a form of Elizabethan English. A regular trade in spices, ivory, 
and slaves began in the mid 1500s when British ships sailed along the Guinea 
coast (Schmied 1991: 6). European forts were subsequently established along 
the West African coast. Pidgin Portuguese was the earliest form of a European 
language used there. As British supremacy in trade gradually grew, English began 
to gain a foothold. During this time West Africans were taken in small numbers to 
Europe to be trained as interpreters. An account in Hakluyt (1598-1600, vol. 6) 
cited by Spencer (1971: 8) suggests that by 1555 five West Africans had been taken 
to England for over a year for this purpose. Within Africa the earliest contacts 
between English speakers and the locals were informal and sporadic. There 
was no expectation of a permanent settlement or of colonization (and therefore 
formal education) until centuries later. In this first phase pidgins and “broken 
English” (i.e. early-fossilized interlanguages) were the outcomes of contact. 
These would not necessarily be ephemeral: West African Pidgin English, whose
roots lie in the seventeenth and eighteenth centuries, is today more widespread
(in the Cameroons, Ghana, and Nigeria) than is English as a second language.
Pidgin English was not the only code used, as the African interpreters returning 
from training in England would probably have used English as a second language 
(ESL) rather than pidgin. Two later influential varieties in West Africa (from about
1787 onwards) were the forms of creole English spoken by manumitted slaves 
who were repatriated from Britain, North America, and the Caribbean. Krio was 
the English creole of slaves freed from Britain who were returned to Sierra 
Leone, where they were joined by slaves released from Nova Scotia and Jamaica. 
Liberia was established in 1821 as an African homeland for freed slaves from the 
US. The creole English that the returnees brought with them was most likely related 
to African American Vernacular English (Hancock & Kobbah 1975: 248, cited by 
Todd 1982: 284; Singler 2004). Today, American rather than British forms of English 
continue to dominate in Liberia (see Singler 2004). Todd (1982) describes four types 
of English in West Africa: pidgin; ESL; Standard West African English (mostly 
oriented to the UK, with the exception of places like Liberia), and Francophone 
West African English. 

In Southern and East Africa, by contrast, no pidgin English emerged, as other 
important lingua francas came into being at different times — e.g. Swahili in East 
Africa and Fanakalo, a Southern African pidgin in which Zulu is the main 
lexifier. In addition urban forms of certain languages like Town Bemba in 
Zambia and youthful mixed forms like Sheng in Kenya play a role in cutting across 
ethnicities. 

Four traditional language phyla are identified within Africa: Afro-Asiatic, Niger- 
Congo, Nilo-Saharan, and Khoe-San. Of these phyla the second is of key relevance 
to English studies, as it accounts for most cases of “Afro-Saxon bilingualism” (my 
term for the coexistence of an African language with English in societies). Niger- 
Congo itself is made up of many subfamilies, the best-known and most influen- 
tial one being the Bantu languages. These are the most widespread languages in 
key territories like South Africa, Zimbabwe, Lesotho, Botswana, Zambia, Malawi 
(in Southern Africa), Nigeria, Ghana and Liberia (West Africa), the Cameroons 
and the DRC (Central Africa), and Kenya, Uganda and Tanzania (East Africa). It 
would appear that, for demographic reasons, it is the Bantu substratum that has 
had the most influence on English in Africa. However, a comparative study of 
the influence of non-Bantu languages, especially of West Africa would be desir- 
able before one could be conclusive about this. This chapter assumes — as do most 
scholars — that it is possible to speak of sub-Saharan English as a typological entity. 
Henceforth I will use the abbreviation SSE for this loose grouping of English as 
a second language in this territory. 


2 Contact versus Non-Contact Effects 


This chapter concurs with most previous writing that contact is a major ingredient 
in the formation of SSE. However, it is important not to prejudge the influence 
of contact, and to examine other possible factors that may have shaped SSE 
significantly. One consideration is that English was often introduced in the class- 
room, rather than via the sizable presence of its L1 (first language) speakers. The 
learning of English was therefore a relatively controlled process, in which the 
influence of the primary context (education rather than contact with speakers) could
be expected to play some role. Janina Brutt-Griffler (2000) coined the term “macro- 
acquisition” for this kind of controlled group learning of a superstrate. To my 
knowledge no close empirical study has been done of the unfolding of classroom 
English of this kind. This would require a longitudinal study of a group of students 
undergoing “macro-acquisition” to describe and understand its characteristic 
processes, in relation to stages of acquisition, nature of the input, etc. in situations 
in Africa and Asia where native speakers have largely departed. In the area of 
phonology there is a common assumption in the literature on “World Englishes” 
that the introduction of English via reading and writing has resulted in “spelling 
pronunciations.” If this hypothesis is correct then we would indeed be faced with 
a new kind of influence that goes beyond that of contact (see section 3.2 for a 
detailed analysis of this possibility). 

There are other positions to consider in weighing up the relative influence of 
contact and its opposites in SSE. One concerns the influence of variation from within 
the superstrate itself. Most World English studies set on determining the features 
of a variety operate within a contrastive paradigm in which the local variety of 
L2 (second language) English is compared to the standard form of the superstrate. 
But given that contact in Africa goes back to the sixteenth century, we should be 
alert to features that are historical retentions that have since been lost in the 
standard varieties of English. Some such lexical examples include trinket ‘item of 
jewelry’ (not necessarily a cheap one); station ‘place of abode’, and not so? (Davy 
2000; Mesthrie 2003). As far as syntax is concerned the form can be able which is 
widespread in parts of Africa (as an apparent equivalent of the auxiliary can) could 
possibly go back to Elizabethan English. Visser (1963-73, III: 1738) gives exam- 
ples of such usage from Thomas More, Shakespeare, Congreve, and Dryden etc. 
The example from Dryden’s Satires (1693) provided by Visser is: That favour is 
sufficient to bind any grateful man, ... to all the future service, which one of my mean 
condition can ever be able to perform. As far as phonology goes, Simo Bobda (2003)
provides a long list of possible phonetic links between SSE and Anglo-Saxon 
varieties that are masked if one were to use the modern standards (or prestige 
varieties — RP or “General American”) as the yardstick. Some of his examples are 
convincing: the monophthongal realizations of the FACE and GOAT diphthongs 
may well be linked to the older forms of these diphthongs ([e] and [o]) in 
Northern English, the Celtic Englishes, and the Caribbean. Likewise, Harris 
(1996) describes the possible dialect input to the realization of the STRUT vowel 
as [ɔ] in West Africa. Other examples of parallels that Simo Bobda (2003) cites
(e.g. -vocalization, fronted variants of NURSE) are less compelling, though his over- 
all position that these parallels should not be entirely ignored is a valid one. 

The final position is a comparative one that considers SSE in relation to L2 
varieties of English across the globe. Platt, Weber, and Ho (1984) identified recur- 
rent similarities in the grammatical and other features of the following countries: 
India, Singapore and Malaysia, Sri Lanka, and East and West Africa. Some of 
these features (statives as progressives, resumptive pronouns, occasional gender 
neutralization of pronouns, and a high frequency of left dislocation and topicalization)
are discussed and exemplified in section 4.1. Although Platt, Weber, and Ho
(1984) excluded South Africa on the grounds that it had a considerable body of L1 
speakers of English, the L2 English of many Southern African countries, including
South Africa, does exhibit the features they documented. Such an approach that stresses
similarities in L2 Englishes world-wide diminishes the importance of the substrates, 
since it would be difficult to maintain that a large number of substrates belong- 
ing to different families and types of languages all produced the same effects. 
Williams (1987) drew these threads together in a psycholinguistic approach that 
looked at processing properties like maximizing salience, avoiding redundancy 
within clauses, making cross-clausal relations more explicit by double marking, 
etc. Today the terms “Angloversals” (Mair 2003; Sand 2004), “New Englishisms” 
(Simo Bobda 2000) and “Universals of New Englishes” (Szmrecsanyi & Kortmann, 
2009) can be found for the persistence of such properties in L2 Englishes, despite 
their infrequency or absence in more mainstream standard forms. 

Finally, although some scholars have provided accounts that attempt to minimize 
the differences between pidgin-creoles and L2 varieties (Mufwene 2001; DeGraff 
2003), the data from Africa and Asia suggests that there is a typological disjunc- 
tion between the two types (Mesthrie 2004a; Szmrecsanyi & Kortmann, forthcoming). 
Within Africa the difference between pidgin/creole varieties like West African 
Pidgin English on the one hand and SSE on the other is vast. For reasons of space 
these pidgin/creole varieties will not be discussed in this chapter. 


3 Effects of Contact in SSE Phonology 


Substrate influence is clearly discernible in SSE phonetics and phonology. Among 
the relevant salient phonetic/phonological tendencies in Bantu languages are: 


1 Five- or seven-vowel systems in which length is not distinctive and diphthongs 
absent (Clements 2000: 135) 

2 Rarity of vowel reduction to schwa 

3 Rich tonal rather than stress systems 

4 A tendency to CV syllable structure, or NCV where N is a nasal


In this section I will concentrate on characteristic (1). 


3.1 Vowel systems in SSE 


Most of the varieties of SSE, especially those spoken by people with lower levels 
of formal education and fewer contacts with native speakers, display a five-vowel 
system, with few or no diphthongs. Whilst there is internal differentiation 
according to the trichotomy East Africa, West Africa, and Southern Africa, the 
very existence of this broad typology argues for substrate influence. African 

Type 1                        Type 2

KIT            FOOT           KIT            FOOT
DRESS          LOT            DRESS          LOT/STRUT
     TRAP/STRUT                    TRAP


Figure 25.1 Short-vowel system


languages tend not to differentiate vowels according to length, tend to have 
few instances of vowel reduction to schwa, and tend to avoid diphthongs. There 
are two subtypes of the five-vowel system, depending on particular mergers. 
Figure 25.1 shows the system in terms of the subset of short vowels in Wells’ (1982) 
lexical sets. 

Type 1, with merger of TRAP and STRUT is found in East and Southern Africa 
(with one difference that I discuss below). Simo Bobda (2001) surveys varieties 
from the following countries: Zimbabwe, Zambia, and Malawi (Southern Africa); 
and Tanzania, Uganda, and Kenya (East Africa) showing this basic pattern. Type 
2 with merger of LOT and STRUT is found in West Africa (Southern Nigeria and 
Ghana) and Cameroon, the neighboring country from Central Africa. The vowel 
systems in northern Nigerian English are different (Gut 2004), relying more upon 
the vowel systems of languages like Hausa, an Afro-Asiatic language. Liberian 
Settler English, spoken largely by descendants of repatriated slaves of the nineteenth
century (Singler 2004), is an offshoot of Black English of the US and is therefore
different from other varieties of English in Africa. Figure 25.1 shows only
the short-vowel reflexes of Wells’ lexical sets. There is inevitably a certain amount
of idealization in the sketches. Allophones of the mid-vowels do occur: e.g. [ɛ]
and [e] and [ɔ] and [o] in South Africa. These might be influenced by raising rules
of [e] and [o] before high vowels in local languages like Zulu and Xhosa. In Southern 
Africa the TRAP vowel tends to have realizations [ɛ] or [e] in addition to [a]. The
variant [e] occurs frequently in words like trap, matter, cat, and is usually given 
as the basic value of the set by most authors on the subject. Hundleby (1963: 68) 
notes a range of articulation points for this vowel amongst Xhosa speakers of 
English, from a lowered [e] to a point back of [a], and suggests that the former 
is a “weak” form and the latter the “strong” form. Van Rooy and Van Huyssteen 
(2000) give the basic value of TRAP as [e] with [e] and [a] as lesser alternatives. 
Van Rooy (2004) gives the basic value as [e] but cites [æ] as a significant alternant.
However, it seems to me that [ɛ] or [e] is not really as widespread as the
[a] equivalent. I propose that the basic form is [a], with some fluctuation according
to words (often monosyllables) and speakers. I also suggest that [æ] is a form
more characteristic of the acrolect. Here is a list of words taken from educated
Black South African English (henceforth BSAE) speakers speaking on national radio
as anchorpersons or expert commentators:

(1) a. salaries     [a]         [salari:z]
    b. balance      [a]         [bala:ns]
    c. analyst      [a]         [anali:st]
    d. affidavit    [a]         [afidavi:t]
    e. adventure    [a]         [adve:ntʃa]
    f. Andrew       [a]         [andru:]
    g. Natal        [a]         [nata:l]
    h. Maritzburg   [a]         [maritsbe:g]
    i. matter       [a] ~ [e]   [ma:ta] ~ [me:ta]
    j. that’s       [æ] ~ [e]   [ðæts] ~ [ðets]
    k. taxi         [æ]         [tæksi:]
    l. standing     [æ]         [stændi:ŋ]


In addition, a mesolectal speaker I reported on in Mesthrie (2005) showed several
variants in his reallocation of the TRAP vowel to either [ɛ], [e] or [a]. The following
examples occur with [ɛ]: cannot, have, understand. The following had [a]: grammar,
actually, emancipation. Attestations of [e] were rare: happy, trap. Attestations of [æ]
were even rarer, occurring only in the word family. Such patterns of reallocation
show a strategy of “mapping” as speakers try to relate new target language
vowels to their L1 system. These two short preliminary case studies suggest that
it is reasonable to consider [a] the basic value, with some alternation (possibly
lexically governed, with monosyllabic bases) with [ɛ] or [e], and some speakers
starting to go beyond the five-vowel system to produce incipient [æ]. However,
there is no serious flaw in proposing the alternative that [e] is basic in the TRAP 
set, with [a] a variant. A third alternative — that TRAP is redistributed over DRESS 
and TRAP - is also possible. However, it is a less likely solution, since there 
does appear to be some phonological rationale (rather than lexical arbitrariness) 
in choosing between [e] (largely with monosyllabic bases) and [a] in most other 
instances. Perhaps we should give the casting vote to the pronunciation of the 
first vowel in the words Africa and African. Here [a] wins hands down for all BSAE 
speakers. 

As far as the other vowels of mesolectal BSAE are concerned the following trends 
emerge. Length is neutralized so that KIT and FLEECE have /i/, FOOT and GOOSE 
have /u/, and LOT and THOUGHT have /ɔ/, with some allophonic variation in
the last set. BATH tends to merge with /a/ (which is TRAP/STRUT in type 1 or 
just TRAP in type 2). Again there are exceptions and complications. However, [a] 
is recorded as the main variant in East Africa, Southern Africa, and much of West 
Africa (southern Nigeria, Ghana) as well as the Cameroons (Central Africa). In 
South Africa heart and hut can sound the same, taking [a]. Central vowels
are also eschewed in SSE. NURSE has the reflexes [a] in East Africa, [ɛ] or [e] in
Southern and much of West Africa (southern Nigeria and Ghana) and [ɛ] or [ɔ]
in the Cameroons (Simo Bobda 2004: 888). Finally, schwa is rare in SSE, usually
being replaced by [a] or a close variant like [e] in all territories. In some instances 
it takes on the full value of [i] (e.g. in certain), [u] (e.g. in people) or [e] (e.g. in 
fountain), with predictable lengthening according to syllable structure (described 

(a) Type 1A                          (b) Type 1B
KIT/FLEECE       FOOT/GOOSE          KIT/FLEECE       FOOT/GOOSE
DRESS/NURSE      LOT                 DRESS            LOT
      TRAP/STRUT/BATH                   TRAP/STRUT/BATH/NURSE

(c) Type 2                           (d) Type 3
KIT/FLEECE       FOOT/GOOSE          KIT/FLEECE       FOOT/GOOSE
DRESS/NURSE      LOT/STRUT           DRESS            LOT/STRUT/NURSE
      TRAP/BATH                            TRAP/BATH


Figure 25.2 Five-vowel system


below). See also the transcriptions of the underlined vowels in words of the TRAP 
set in example (1) above. 

At this stage, taking into account vowel-length neutralization, we have the 
monophthongal systems shown in Figure 25.2a-c. Type 1A is Southern African;
type 1B is East African and type 2 is West African. In the Cameroons we have a
further type (Figure 25.2d), distinguished by the treatment of NURSE, which is
usually [ɔ], though some words in this set take [e] (Simo Bobda 2004).
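
The four systems of Figure 25.2 can also be restated as data, which makes the regional differences easy to query. The following Python is a rough sketch for illustration only: the dictionary layout, the labels "1A", "1B", "2" and "3" as keys, and the helper name `merged` are assumptions made here, and the mergers are idealized in exactly the way the figure is.

```python
# Each system is a list of five vowel classes; each class is the set of
# Wells lexical sets that share one vowel quality in that system.
SYSTEMS = {
    "1A": [{"KIT", "FLEECE"}, {"FOOT", "GOOSE"}, {"DRESS", "NURSE"},
           {"LOT"}, {"TRAP", "STRUT", "BATH"}],            # Southern Africa
    "1B": [{"KIT", "FLEECE"}, {"FOOT", "GOOSE"}, {"DRESS"},
           {"LOT"}, {"TRAP", "STRUT", "BATH", "NURSE"}],   # East Africa
    "2":  [{"KIT", "FLEECE"}, {"FOOT", "GOOSE"}, {"DRESS", "NURSE"},
           {"LOT", "STRUT"}, {"TRAP", "BATH"}],            # West Africa
    "3":  [{"KIT", "FLEECE"}, {"FOOT", "GOOSE"}, {"DRESS"},
           {"LOT", "STRUT", "NURSE"}, {"TRAP", "BATH"}],   # Cameroons
}

def merged(system: str, set_a: str, set_b: str) -> bool:
    """Return True if the two lexical sets fall into one vowel class."""
    return any(set_a in cls and set_b in cls for cls in SYSTEMS[system])

if __name__ == "__main__":
    for name in SYSTEMS:
        print(name,
              "TRAP=STRUT:", merged(name, "TRAP", "STRUT"),
              "LOT=STRUT:", merged(name, "LOT", "STRUT"))
```

Queried this way, the TRAP/STRUT merger picks out types 1A and 1B, the LOT/STRUT merger picks out types 2 and 3, and the placement of NURSE separates the members of each pair.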

The L1 English diphthongs are not as easily generalizable in SSE. A noticeable 
tendency is to monophthongize FACE and GOAT. These are given as [e] and [o] 
respectively in East and West Africa, the Cameroons and South Africa. It would
be interesting to see whether these are phonemically distinct as /e/ and /o/ from
the other sets which might be conceived of as /ɛ/ and /ɔ/. This appears to be
the case in Southern Nigeria (Gut 2004: 819), and for South Africa (van Rooy 2004: 
947). Finally, although vowel length is not a distinctive feature in SSE, there is a 
regular phonological rule that lengthens vowels in heavy final syllables (CVC or 
VCC) or else in the penultimate syllable (see Van Rooy 2004: 950). 
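
The lengthening rule just stated can be sketched procedurally. The Python below is a simplified illustration under stated assumptions, not Van Rooy's (2004) formulation: words are supplied already divided into syllables, a final syllable counts as heavy whenever it ends in a consonant, length is marked with a colon as in example (1), and the function names are invented for this sketch. Light monosyllables are left unchanged, since the rule as stated says nothing about them.

```python
VOWELS = set("aeiou")

def is_heavy(syllable: str) -> bool:
    """Simplification: a syllable is heavy if it ends in a consonant."""
    return bool(syllable) and syllable[-1] not in VOWELS

def lengthen_vowel(syllable: str) -> str:
    """Insert ':' after the last vowel of the syllable."""
    for i in range(len(syllable) - 1, -1, -1):
        if syllable[i] in VOWELS:
            return syllable[: i + 1] + ":" + syllable[i + 1:]
    return syllable  # no vowel found; leave the syllable unchanged

def apply_length_rule(syllables: list[str]) -> str:
    """Lengthen the vowel of a heavy final syllable, else of the penultimate."""
    out = list(syllables)
    if not out:
        return ""
    if is_heavy(out[-1]):
        out[-1] = lengthen_vowel(out[-1])
    elif len(out) >= 2:
        out[-2] = lengthen_vowel(out[-2])
    return "".join(out)

if __name__ == "__main__":
    for word in (["sa", "la", "riz"], ["ba", "lans"], ["ma", "ta"]):
        print(apply_length_rule(word))
```

On these three inputs the sketch places length on the final syllable of salaries and balance and on the penultimate syllable of matter. It is not claimed to cover every form in example (1); forms with a long vowel in an open final syllable would need a fuller statement of the rule.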

It would be difficult to find a reasonable explanation for the broad similarities in 
the vowel systems of SSE other than the effects of language contact, with subsequent 
regularizations. However, substrate influence is mediated by other factors. Take 
the vowel-length rule of BSAE as an example. The broad impetus for penultimate 
lengthening comes from languages like Zulu and Xhosa (where this is a regular 
feature, secondary to tonal effects on the syllable in question). Nonetheless, the 
rule has to be modified in the emerging L2 dialect to take into account the pres- 
ence of heavy final syllables, lacking in the substrates (see further Van Rooy 2004: 
950). Finally, I agree with Harris (1996) and Simo Bobda (2003) that not all 
influences should automatically be attributed to substrate influence. Some of 
the variation in realizations of individual lexical sets might have been due to 
differential input. Harris (1996: 7-8) points out that even where interlingual
phonemic identification between substrate and L2 English appears to indicate
substrate influence “lock, stock and barrel,” the very fact that the L2 lexis comes
almost entirely from the superstrate means that “on the basis of shared lexical
stock, it is still possible to establish regular phonological correspondences
between the contact variety and its lexical donor.” He proposes that the realization
of TRAP and STRUT as [a] and [ɔ] respectively in West Africa and [e] or [ɛ]
and [a] in Southern Africa is due to such differences in superstratal values in the
seventeenth century for West Africa and the nineteenth century for Southern Africa. 
This is a plausible argument, except that the Southern African realizations of TRAP 
are (as pointed out above) more complex than an examination of just the (mono- 
syllabic) token trap indicates. A large number of tokens in the set have [a] which 
is more likely to favor the substrate pattern, rather than the shape of the super- 
strate in nineteenth-century South Africa (showing raising of the TRAP vowel). 


3.2 Possible roles for spelling pronunciations 


The final topic in my exploration of language contact phonology and its opposites 
is a consideration of what I term the “spelling form hypothesis.” A number of 
writers have tacitly assumed and sometimes openly asserted that spelling pro- 
nunciations are rampant in the variety of SSE that they describe. The position is 
that since these varieties were learnt via the education system, they show the 
influence of the written medium more than other instances of L2 acquisition involv- 
ing interactional contexts with at least some mother tongue speakers. This is an 
important position to inspect since it poses a major challenge to contact and 
historical linguistics: that substrate and superstrate are subject to a “third force” of 
reading in L2 formation under macro-acquisition. I tackle this issue in Mesthrie 
(2005), where I agree that some spelling forms are inevitable for words that one sees 
in print but seldom ever hears pronounced: rare names of people, places, and things, 
for example. But there are many unwarranted assumptions in generalizing this 
influence to all or even a large section of the vocabulary of an individual L2 
variety. The unwarranted assumptions are as follows: 


1 Literacy is widespread. 

2 L2 speakers learn the spelling of words before they formulate (mental) 
phonological classes of words. 

3 Spelling is easy, pronunciation is difficult. 

4 English letters “have” sounds, about which there is consensus amongst all L2 
speakers. 

5 People access orthographic forms as they speak. 

6 L2 diachrony is the same as L2 synchronic processing.

7 Spelling pronunciations are random. 


In fairness to authors on SSE, their observations are often made in passing, when 
attempting to give a descriptive picture of stress and vowel realizations, rather 
than seriously proposing a theory of acquisition of individual sociolects. One might,
more appropriately, distinguish between a strong form of the hypothesis, holding 
that spellings are of considerable significance in L2 phonology, and a weaker form, 
specifying that under certain conditions, some words show a spelling pronunciation 
in the sense that their orthographic representation influences their phonological 
form in a particular L2. 

Assumption 1 is patently false for much of South Africa, for example, where 
the functional illiteracy rate amongst adults was 29 percent in 2001.⁴ In discounting
assumption 2, it should be stressed that the “first learners” of BSAE did not learn 
the target language (mainly) via the written mode: there had to be teachers using 
spoken English (including Teacher Talk). For BSAE the first generation of teach- 
ers were mainly missionaries from a variety of European backgrounds (Magura 
1984; Mesthrie 1996). Their pupils would have tried to imitate the pronunciations 
of their teachers as best they could, using their own L1 phonology as a rough 
guide to the system they were learning. If, for example, their native phonology 
had no schwa then they would find its nearest equivalent, using a slow but 
reasonably systematic “mapping procedure.” The nearest equivalent settled upon 
appears to be [e], a close variant of /a/ as a default, pending other principles 
(discussed below). Assumption 3 is counteracted by anecdotes of experiences of 
students I interviewed, some of whom claimed that they had few problems 
learning the pronunciation of words, but had to work especially hard to learn the 
spelling forms of an erratic target language. An interview with one speaker even 
suggested that in his case, rather than L2 spelling generally driving the phonology 
of the L2, it may well be that L1 phonology can be used to help learn L2 spelling. 
Examples from university students, even at postgraduate level, still reflect the lack 
of vowel-length distinctions: beeches for “bitches” (in an essay on slang); whizzing 
for “wheezing” (in a dissertation on health); feast for “fist” (in an essay on sport) 
etc. These “pronunciation spellings,” as I term them, call into question the assumed
direction of influence in assumption 3.

Assumption 4 is contradicted by the fact that different varieties have come up 
with different realizations of the same spelling forms (as in the account of the 
vowels above). The question one would want to explore is why there are (alleged)
spelling pronunciations for, say, unstressed vowels in BSAE, but no spelling pro- 
nunciations for, say, instances of postvocalic /r/. Clearly the influence of the L1 
is a significant factor in the nonrhotic nature of the L2 variety being developed. 

Assumptions 5 and 6 fail to draw a distinction between language description 
and psycholinguistic processing. That is, for L2 studies, the phonological forms 
have to be analyzed as part of an interlocking system, irrespective of whether they 
originate from imitation of target-language speakers, L1 transfer, overgeneralization 
of rules, analogical formation, or the occasional spelling pronunciation. Such 
properties are of immense historical interest, but should not be assumed to be 
“active” in the synchronic phonological processing of the L2 variety being studied. 

Assumption 7 is related to assumption 4, for which I propose that spelling 
pronunciations in second language acquisition are not unconstrained. Spellings 
might suggest certain pronunciations, but these pronunciations are realized in terms
of permissible patterns of the L2 being developed and the L1. Thus if L2 learners 
of English come across a new word in print, they will not give it a spelling pro- 
nunciation if the forms suggested by the spelling are phonologically unaccept- 
able in their L1 and in their L2. They have no option but to rephonologize it as 
best they can. 

In Mesthrie (2005), on the basis of analyses of three speakers of Black South 
African English, I propose a developmental profile that shows the initial import- 
ance of contact, and the gradual restructuring away from a basic contact variety. 
Speakers initially have a five-vowel system with some diphthongs. In the basilect 
itself some minor variations occur with schwa, [æ] and [a] occurring incipiently.
In the mesolect these three develop further (i.e. are more frequently used), whilst
still remaining variants within the five-vowel system. Finally, the acrolect fleshes out
the peripheral vowels, with [a] and [æ] occurring frequently, though not invariantly.
Furthermore, /a/ occurs regularly in the acrolect as [a] medially and [a]
initially. Even in the acrolect [ɜ:] is absent; and target-language schwa corresponds
to [e] in exposed position, and to the unstressed variants /i/, /u/, /e/ depending 
on the surrounding segments, or by convention with certain suffixes. This develop- 
mental study gives only minor support for the notion of spelling pronunciation 
being a significant part of the dialect. 


4 Contact and SSE Syntax 


I argue that substrate influence is less clear in syntax than in phonology, though 
some such influence doubtlessly exists. A useful starting point for our discussion 
of the syntax of English in Africa is the collection of articles in the Handbook of 
Varieties of English (Kortmann et al. 2004), which contains descriptions of English 
in South Africa, Nigeria, the Cameroons, Ghana, and East Africa (i.e. Tanzania, 
Kenya, and Uganda). 


4.1 Four syntactic features of SSE 


In this section I describe and exemplify four features that are found in all five
territories covered in Kortmann et al. (2004), and offer an assessment of the extent
to which they can be said to be outcomes of contact. (This does not exhaust the
store of recurrent syntactic similarities, for which see Mesthrie 2004a.)


Use of resumptive pronouns 
As with many varieties worldwide (including informal L1 English) resumptive 
pronouns surface in relative clauses, where Standard English uses a “gap”: 


(2) a. There is our glue which we are getting them near. (East Africa — Schmied
2004: 934)

b. The other teacher that we were teaching English with her went away.
(Cameroon — Mbangwana 2004: 906)

c. The old woman who I gave her the money [...] (Ghana — Huber and
Dako 2004: 858)


Resumptive pronouns have not been reported in subject position in any variety 
of SSE. Sentences like (3) below are unattested in a wide variety of sources 
consulted: 


(3) ? This is a man that he loves dogs. 


The occurrence of resumptive pronouns in SSE illustrates the difficulties in decid- 
ing between contact and other factors in L2 dialect emergence. As resumptive pro- 
nouns are abundant in Bantu languages there is indeed a case for substrate influence. 
However, general L2 processing might well account for the phenomenon, since 
the use of a pronoun is a more transparent form of syntax than a gap. In Zulu
(an example of an influential substrate for BSAE), relative clauses do not allow a 
“gap”. They have to be filled by a preposition plus resumptive pronoun like 
na-ye ‘with him/her’, wa-khe ‘of him/her’, and nga-lo ‘by means of this’ (in non- 
subject and direct object functions). The direct object and subject NPs in the rela- 
tive clause take the usual subject or object concords. Although these concord markers 
are less salient than the free pronouns exemplified above, they do exemplify the 
absence of a gap (and can be considered the equivalents of bound pronouns). Subject 
concords occur in all clauses and cannot be dropped (e.g. umama ucula iculo ‘mother
sings a song’, where the first u is the noun class marker and the second u is its
subject concord). Object concords occur mainly for focusing effects, with umama
uyicula iculo translating into something like ‘mother sings it, the song’. The object
concord yi is dropped in sentences unmarked for focus. It is not possible to use a
free pronoun with subject relative NPs (‘mother who-SC sings’ but not *‘mother
who-SC she sings’, where SC is the subject concord u and ‘she’ the free pronoun yena).
The same consideration holds for object case relative NPs. 

Substrate influence thus explains why resumptive pronouns are common in BSAE 
in oblique cases other than direct object. It also suggests why they do not occur 
in subject position. However, it does not cover the direct object case, since BSAE
allows a resumptive pronoun here (though infrequently), while the substrate has
no free pronoun in this function. It might be countered that the object concord
marker is a sort of equivalent of a bound pronoun; however, its occurrence is
for special focusing, rather than in unmarked sentences. So in coming to a full
understanding of substrate influences, pragmatic functions in both substrate and
L2 variety have to be factored in.

The psycholinguistic processing model appears to be relevant here, since the 
hierarchy posited by Keenan and Comrie (1977) suggests that it is a typological 
universal that resumptive pronouns appear with the oblique cases (indirect 
object > genitive > direct object) rather than the easiest position of subject. 
(Sentences 2b and 2c above exemplify resumptives with indirect object and 
sentence 2a with direct objects.) The avoidance of resumptives might also be related
to another internal factor in SSE: the predilection for left dislocation. It is this that 
we turn to next. 


Left dislocation 

This construction isolates a “topic” and follows it up by a “comment,” which in 
the grammatical sense is a full sentence, containing a copy or appositional pronoun
relating to the topic NP. In the sentences in (4) below the topic and comment are
separated by a dash, and the copy pronoun immediately follows it:


(4) a. The students — they are demonstrating again. (Nigeria — Alo & Mesthrie
2004: 823)
b. That woman — she cheated me. (Ghana — Huber & Dako 2004: 862)
c. The people — they got nothing to eat. (South Africa — Mesthrie 1997: 127)


Such copy pronouns typically occur in subject positions in the comment, but other 
positions on the noun phrase hierarchy are also possible, as in (5): 


(5) Q: Where did you learn Tswana? 
A: Tswana, I learnt it in Pretoria. (South Africa — Mesthrie 1997: 127)


As with resumptive pronouns, several difficulties surround any one explanation 
for the occurrence of left dislocation. A superstratist might aver that this is simply 
a continuation of a construction common in informal English that carries pragmatic 
effects of contrast, itemizing NPs from a list or reintroducing given information 
after a stretch of discourse (see Prince 1981; Finegan & Besnier 1989). However, 
my experience of decades of listening to English suggests that some varieties use
the strategy far more frequently than others: notably Black South African English 
and Indian South African English (Mesthrie 1992; 1997). In some instances BSAE 
comes close to grammaticalizing the use of copy pronouns, especially with noun 
phrases like the people or with complex noun phrases (or both as in (6) below). 


(6) The people who are essentially born in Soweto — they can speak Tsotsi. (South 
Africa — Mesthrie 1997: 132). 


The fact that complex noun phrases induce copy pronouns might suggest a pro- 
cessing explanation: copy pronouns help keep track of the salient topic noun phrase 
better than an S-V-O syntax might, especially if the subject is a complex NP.
A third explanation does involve contact. When a Bantu language like Zulu needs 
to express special effects like focusing or contrast, a free pronoun form comes into 
play. An unmarked declarative sentence like ‘the cattle died’ is Izi-nkomo za-fa, 
with subject concord marker zi (assimilated with the past tense vowel to form za) 
and no absolute (or free) pronoun. In a (marked) contrastive clause like the fol- 
lowing, such a pronoun does surface (Taljaard & Bosch 1988: 78): 
(7) Izi-mvu    za-sinda,       kodwa izi-nkomo   zona za-fa.
    cl10-sheep SC.PAST-survive but   cl10-cattle they SC.PAST-die
    ‘The sheep survived, but the cattle died.’


The BSAE equivalent sentences are thus likely to be substrate-induced, since the 
form in Zulu is indeed a pronoun and not — say — a focusing particle. However, 
the pronoun form is at a pinch a possibility in equivalent sentences in the super- 
strate too. Although one does not generally say “The sheep survived but the cattle 
— they died,” it is not wildly ungrammatical to do so. With special intonation and 
pause structure within a certain discourse context, it is a possibility. 


Stative and habitual forms of be + -ing 

In most New Englishes the distinction between present progressive forms and 
stative/habitual forms is a lot more fluid than in standard English. Sentences (8a) 
and (8b) exhibit statives with -ing, while (8c) shows a habitual with -ing:


(8) a. ... it produces a lot of smoke ... heavy smoke and it is smelling ... (East
Africa — Schmied 2004: 930)
b. The rural areas are not having access to higher education. (Ghana — Huber
& Dako 2004: 855)
c. People who are having time for their children ... (South Africa —
Mesthrie 2004b: 963)


Leketi Makalela notes that habitual and progressive are not usually differentiated
in Sepedi (aka Pedi or North Sotho). This would explain why the distinction between
habitual and progressive senses of the same verb (I work vs. I am working) is overridden
in early interlanguage. Translations of the form Ndiyasebenza (Xhosa) cover
both possibilities. If one wishes to emphasize the ongoing nature of the activity 
then the “persistative” prefix ya would be used (‘am still working’, ‘am working 
right now’).⁶ In English there is a somewhat rigid difference between stative and
nonstative verbs: this time a verb belongs to one or the other category — know, 
understand, love, like, have (it) but not (generally) *am knowing, understanding, loving, 
liking, having (it). Since Xhosa does not usually differentiate between habitual and 
progressive forms, it makes no sense to expect it to differentiate formally between 
stative and nonstative in the present tense. Ndi-ya-kha ‘I am building’ and Ndi-ya- 
zi ‘I know’ have the same morphosyntax (‘I — long present — verb’). However in the 
perfect, there do appear to be differential effects, showing stative and nonstative 
to be cognitive categories in Xhosa that are grammaticalized in different ways.⁷

Once again, although there is solid support for substrate influence, it is likely 
that superstratal factors also play their role. One factor is that the superstratal 
differentiation of the categories, rather than being transparent to new learners, is 
variable and subject to change. Hear, for example, is a stative verb that may allow
be + -ing in certain contexts: What I’m hearing is that people in the room are unhappy 
is a possible sentence of standard English. Another example concerns subtle 
variants like I love this song versus I’m loving every minute of this song. Likewise, 
the McDonald’s slogan I’m loving it, current from 2006 to the time of writing, is slightly
opaque to my students of L1 and L2 English background, who couldn’t see the
subtle grammatical play (between stative and nonstative). It is not clear which of
the competing explanations best fits the facts: language contact, L2 processing,
or variable (if not misleading) signals from the superstrate.


Pronoun gender conflation 
The occasional conflation of he and she is reported for all the varieties studied here, 
except the Cameroons. 


(9) He is called Mary (Ghana — Huber & Dako 2004: 859) 


Huber and Dako mention a trend towards gender selection of the pronoun 
according to the head noun: 


(10) a. He was looking for her aunt (Ghana — Huber & Dako 2004: 859; std Eng: 
... his aunt) 
b. She thought his husband had travelled (Ghana — Huber & Dako 2004: 
859; std Eng: ... her husband). 


There is a strong case for substrate influence here; Bantu languages do not dif- 
ferentiate sex gender with pronouns at all. They do differentiate a variety of pro- 
noun forms in concord with the class of the noun they refer to. But this does not 
apply to humans, which prototypically occur within the same noun class (1 and 
2). Thus the Xhosa pronoun equivalent to the inanimate it of English comes from 
a large set of forms determined by class affiliation of the equivalent noun: wona 
(class 3), lona (class 5), sona (class 7), yona (class 9), and bona (class 14). On the 
other hand the equivalents of third person human forms he and she (and him 
and her) are yena alone. So the case for substrate influence is strong. One case 
study (Mesthrie 1999) enables us to evaluate how this influence operates from a 
developmental perspective using the elementary English of Xhosa working-class 
speakers in Cape Town. In early interlanguage data he is the basic form for humans 
(male or female); it is always used for nonhumans. This accords closely with the
substrate. She emerges in subsequent acquisition for female humans, with some 
hypercorrection (she for males but not for nonhumans). The full pattern is even- 
tually acquired, although some “backsliding” occurs. Even university professors 
may under duress or extreme relaxation occasionally use he for she or vice versa. 
All of this shows a significant influence of the substrate Xhosa grammar. The 
pattern identified for early English interlanguage of Xhosa speakers does not appear 
to be the same as for children’s L1 acquisition of English, where no gender vari- 
ation has been reported (see Huxley 1970: 151). The absence of total substrate 
influence even in early interlanguage can be seen in realizations of number. Speakers 
in Mesthrie’s (1999) database occasionally conflated singular and plural (he for they), 
in sharp contrast to Xhosa grammar which categorically differentiates pronouns 
by number. 
To end this subsection, the general point can be made that SSE syntax shows
the mutual influences of substrate, superstrate, and processing universals. 


4.2 Anti-deletions


Given the search in modern linguistics for general principles (and parameters) of 
syntactic organization, the effects of contact should ideally be meaningfully inte- 
grated within an unfolding grammar. One of the weaknesses of current approaches 
toward SSE is that the features of the variety are described in piecemeal lists that 
sometimes do not distinguish adequately between syntax, morphology, discourse, 
and even lexis. In this section I propose an approach that integrates
a large proportion of the syntactic features of BSAE with a general syntactic prop- 
erty that I term “anti-deletion” (based on Mesthrie 2006). In this formulation, the 
syntax of mesolectal BSAE accords largely with that of L1 English in respect of 
underlying patterns and structures. However, it differs on the surface in dis- 
favoring empty nodes on trees. The first property that I identify is an “undeletion” 
— a tendency to restore elements that by almost all syntactic accounts involve a 
gap or deletion in standard English. Here are some brief examples with discussion: 
further details can be found in Mesthrie (2006). The examples were selected from 
mesolectal speakers. 

(11) a. Come what may come. (‘Come what may Ø.’)

b. He made me to do it. (‘He made me Ø do it.’)

c. The fact has made me to conclude that my idea is sound. (‘... made me
Ø conclude ...’)

d. So she was warning us that, “You’d better learn this language because,
like, you’re going to Cape Town.” (that for Ø)

e. They’ll just tell you that, “We have been using Fanakalo.” (that for Ø)

f. As you know that they are from the Ciskei. (that for Ø)

g. As I made it clear before, I am going to talk about solutions, not problems.
(‘As I made Ø clear ...’)

h. As it is the case elsewhere in Africa, much can still be done for
children. (‘As Ø is the case ...’)


These examples show that, in common with other SSE varieties, BSAE speakers
disfavor the deletion of elements like infinitive to (11b), dummy it after verbs in
constructions like make clear (11g), dummy it before the verb be (11h), complementizer
that before direct speech (11d) and (11e), and adjunct that after as you
know, as I said etc. (11f). Resumptive pronouns in relative clauses also qualify as
undeletions but are not exemplified here, as they have been discussed in 4.1 above.
Because of these properties I suggested the following principle:


Principle 1: If a grammatical feature can be deleted in standard English, it can 
be undeleted in BSAE mesolect. 
Reference to standard English in the formulation acknowledges its role as the form
of the superstrate aimed at (but not always achieved) in “macro-acquisition.” A 
second category concerns nondeletions in the standard dialect: i.e., places where 
the grammar of English dialects seems to readily permit deletion, but not the 
standard language. This category includes well-known features like pro-drop, 
copula deletion, ellipsis, gapping, and lack of do-support. As none of these ten- 
dencies exist in BSAE, they would not ordinarily be thought of as features of the 
variety. Yet they fit the anti-deletions tendency very well, and suggest a second 
principle: 


Principle 2: If a grammatical feature can’t be deleted in standard English, it almost 
always can’t be deleted in BSAE mesolect. 


The last property I describe is insertion (the opposite of deletion). Looking at the 
remaining features described by linguists working on BSAE (Gough 1996; De Klerk 
2005; Makalela 2004), I found very few instances of deletion or permutation of 
elements at the syntactic and morphological level. By contrast, a large majority 
of features involved the insertion of a morpheme, where standard English has 
none. This pertains to constructions like the following: 


1 the double marking of clauses, well known in the New Englishes of Asia and 
Africa (sentence 12a); 

2 the use of can be able for ‘can’ (sentence 12b); 

3 the frequent use of that one for anaphoric that (sentence 12c); 

4 the existence of occasional double conjunctions like supposing if ‘if’; unless if
‘unless’; because why ‘because’; double comparatives like more better; and occasional
double negatives;

5 the presence of “underlying” prepositions with verbs like mention (about), dis- 
cuss (about), voice (out). 


(12) a. Although I’m not that shy, but it’s hard for me to make friends. 
b. ...how am I going to construct a good sentence so as this person can 
be able to hear me clearly... 
c. Q.: Do you know more Xhosa or Zulu? 
A.: That one I can’t tell you. 


These insertions lead to a last principle: 


Principle 3: If X is a grammatical feature of BSAE mesolect that is not covered 
by Principles 1 and 2, then X almost always involves the presence of a mor- 
pheme not found in standard English. 


I show further in Mesthrie (2006) that none of these principles characterize the 
other major English varieties of South Africa, and that of all the constructions men- 
tioned in this section only two occur in another variety. These are left dislocation 
and occasional resumptive pronoun use in South African Indian English, but not
to any great extent in any variety of the country. I proposed that 




The “anti-deletion” concept and the more specific “undeletion” will, I believe, 
help characterise other New Englishes too, especially other varieties in Africa... 
Mesolectal BSAE shows a pendulum swing between L1 tendencies that favour 
undeleting and the influence of settings required by the standard form of the TL. 
(Mesthrie 2006: 142) 


In terms of the investigation of language contact, anti-deletions can be attributed 
to the nature of the substrata, in which deletion and movement rules are rare (see 
van der Spuy 1997 for Zulu; du Plessis & Visser 1992 for Xhosa). However, a closer 
comparison has still to be undertaken. Andrew van der Spuy, a specialist in Zulu 
syntax, is of the opinion (in work in progress) that many — but not all — of the 
properties would be supported by a contrastive analysis with Zulu. 


5 Conclusion 


African Englishes are clearly very important varieties in characterizing language 
contact. They have stabilized in situations where native speakers have become 
increasingly rare (except in South Africa), and remain the most important media 
in pan-African communication (notwithstanding the importance of Swahili, 
Arabic, French, and Portuguese). In this chapter I have tried to characterize the 
recurrent similarities in SSE, and examine the relation of substrates, superstrates, 
and processing universals. As Mufwene (1986) argued convincingly for another 
set of varieties (pidgins and creoles), all three spheres of linguistic influence play 
a complementary role in molding a subset of New Englishes of the world. 


NOTES 


1 Clements (2000: 140) goes further in defining the structure of prefixes as (C)V; roots as 
CV(N)C, derivational elements as -VC- and the obligatory final vowel suffix -V. 

2 The data was collected over a three-day period in June 2008 from the early morning 
news program. 

3 Subject to the discussion above that monophthongal tokens (mainly) of the TRAP set 
often take [e] or [e]. 

4 And probably higher for speakers of English as an L2. 

5 In this section I use examples from either Zulu or Xhosa — two closely related and 
mutually intelligible Bantu languages of the Nguni subgroup, which are the two most 
numerous languages of the country in terms of native speakers. 

6 The usual term in Bantu linguistics is persistive — I prefer the additional morpheme in 
line with adjectives like resultative and consultative. 

7 Briefly, stative verbs with perfective suffix -ile that correspond to English be + adjec- 
tive (e.g. ‘be hungry’) have present, non-perfective meaning; in contrast to non-stative 
verbs which have the expected perfective meaning. 
26 Contact and the Celtic
Languages 


JOSEPH F. ESKA 


The Celtic languages, once spoken across much of central and western Europe 
and even into Asia Minor, display evidence of contact with many languages from 
their first attestation in the sixth century BCE. The nontypical features of the Insular 
Celtic languages vis-a-vis other Indo-European languages, perhaps most famously 
basic verb-initial clausal configuration, as attested in Irish¹ and in the earliest records
of the Brittonic languages (Welsh, Cornish, and Breton), conjugated prepositions, 
and polypersonal verbs with code agreement for both subject and object, have 
occasioned a considerable body of scholarship on the question of whether they 
are to be attributed to prehistoric contact with a substratal language and, if so, 
what this unattested language must have been like. As minority languages in recent 
centuries, the contemporary Celtic languages show the continuing effects of con- 
tact with English in Ireland and Britain and French in Brittany. This chapter will 
summarize the evidence for contact in the early history of the Celtic languages 
and then will focus upon the argument for contact in the prehistory of the Insular 
Celtic languages. 


1 Contact with Known Languages 


1.1 Continental Europe 


The ancient Celtic languages of continental Europe are known to have been spoken 
in the Iberian Peninsula (Hispano-Celtic), France (Transalpine Celtic), northern 
Italy and Switzerland (Cisalpine Celtic), and Asia Minor (Galatian). There is also 
evidence of Celtic languages having been spoken in eastern Europe and the Balkans 
(Eastern Celtic and Noric). All of these languages are only fragmentarily attested, 
but in sufficient quantity to know that their linguistic structures were similar to 
those of other ancient Indo-European languages such as Latin and Greek. 

Since the Continental Celtic languages are only fragmentarily preserved, we can 
get no more than a partial idea of the extent of the contact phenomena in which 
they participated. There is considerable evidence for lexical borrowing, especially, 
though not exclusively, with Latin, in both directions. Schmidt (1983) studies 
language contact in Roman Gaul in general terms. The evidence for borrowing
into the Continental Celtic languages, of course, is minimized by the size of the 
attested corpus, though we do find lexical items such as Cisalpine Celtic lekatos 
< Latin legatus ‘ambassador’ and Transalpine Celtic πραιτωρ < Latin praetor ‘the
name of a magistrate’.² A number of scholars have especially studied the evidence
for Celtic lexical borrowings into Latin (and Greek), e.g. Schmidt (1967), Gernia 
(1981), and André (1985),³ and many others have focused upon Celtic substrate
words in the Romance languages, e.g. Corominas (1956; 1976) for Spanish, 
Lambert (2003: 187-203) for French, and Campanile (1983) for Italian. 

Evidence for borrowing or interference in the areas of phonology, morphology, 
and syntax is much more difficult to identify, especially in the case of possible 
contact with Latin, because the linguistic features of the Celtic languages of ancient 
France and Italy were often very similar to those of Latin.⁴ There clearly must have
been a protracted period of bilingualism (Adams 2003: 184-200) as evidenced, 
for example, by the thoroughly bilingual, complete with code-switching, account- 
ing records kept by the potters at the site of the ceramic factory at La Graufesenque 
in southern France (Adams 2003: 687-724). In some instances, Transalpine Celtic 
inscriptions seem to provide evidence for the late development of a mixed language 
in which elements of Latin and Greek are found (e.g. Meid 1980; Droge 1989). 


1.2 Ireland and Britain


As on the continent, there is a lot of evidence for lexical borrowing, especially from 
Latin into the Celtic languages.⁵ Linguistic contact in Roman Britain is examined
in detail by Evans (1983). The phonology of the Latin borrowings has been studied 
in detail for the Celtic languages, in general by Pedersen (1909: 189-242), for Old 
Irish by McManus (1983) and Uhlich (2004), and for the medieval Brittonic lan- 
guages by Jackson (1953). 

Evidence for borrowing or interference in the areas of phonology, morphology, 
and syntax is rather thinner on the ground. Some examples include preaspiration 
of plosives in Scottish Gaelic, e.g., cat ‘cat’ [kʰaʰt], which is usually taken to be
the result of contact with Old Norse (Marstrander 1932: 298); the pluperfect tense 
in the Brittonic languages, formed by the affixation of imperfect desinences to the 
preterite stem, which has been argued to be modeled on the Latin pluperfect 
(Mac Cana 1976); cf. Middle Welsh first person singular carass-wn to Latin amav- 
eram ‘I had loved’; and compound tenses of Breton, which are formed with an 
auxiliary verb and the past participle, which clearly are based on French (Hemon 
1975: 245-6), e.g.:⁶


(1) a. French:
       Vous avez             fait ...
       2.PL have.2.PL.PRES   make.PST.PTCPL
    b. Breton:
       Chui och’eus          graet ...
       2.PL have.2.PL.PRES   make.PST.PTCPL
       ‘You have made ...’
2 Contact in the Prehistory of the Insular
Celtic Languages 


2.1 The phonological approach 


Archeological data demonstrate that Ireland and Britain had been settled by humans 
for many millennia before speakers of Celtic languages could have arrived there. 
No documentation of the languages of these earlier inhabitants survives, but some scholars have
attempted to identify evidence for their survival into the first millennium CE via 
the examination of lexical items which bear phonological features which, they argue, 
cannot be Celtic, e.g. Schrijver (2000; 2005) and Mac Eoin (2007). Similarly, Hamp 
(e.g. 1990; 1991) has sought to identify the vowel inventory of a pre-Indo-European 
language with which various Indo-European languages were in prehistoric contact 
in northern Europe. 


2.2 More comprehensive approaches 


Far more attention, however, has been devoted by scholars to identifying a 
substratal language to which many of the non-Indo-European-looking features 
of the Insular Celtic languages can be attributed via contact. Hewitt (2007) 
delineates 37 such features which have been proposed as diagnostic of con- 
tact between proto-Insular Celtic and a language of Afro-Asiatic type, among 
which are: 


Conjugated prepositions
VSO clausal configuration
Invariant relative particle
Copying as a relativization strategy
Special relative form of the finite verb
Polypersonal verbs
Infixing/suffixing alternation of personal object affixes
Position of the article in genitive embeddings
Lack of subject agreement with full plural NPs
Verbal noun rather than infinitive
Predicative particle identical to a local preposition
Prepositional periphrastic tenses
Periphrastic tenses formed with ‘do’ + verbal noun
Subordinating use of ‘and’
Verbal noun used instead of finite verb in main clause
Syntactically governed word-initial phonological change
Idiomatic genitive kinship constructions
Amplification of negative by a noun after the verb
Numerals followed by a singular noun
Prepositional expression of ‘have’

2.3 Principal advocates of the substratum theory and
their views


Although earlier scholars had noted parallels between the grammars of Welsh 
and Hebrew, Morris Jones (1900) is the first to focus on a number of syntactic 
parallels between Welsh and Egyptian (and Berber to some extent). He does not 
propose that an Afro-Asiatic substratum was necessary to explain the Insular Celtic 
facts, but does raise the question. 

Julius Pokorny (1926-9), on the other hand, believes firmly in the substratal 
explanation of a large number of features of the Insular Celtic languages, drawn 
not only from Afro-Asiatic languages, but also from Bantu, Basque, Caucasian, 
Finno-Ugric, and Eskimo-Aleut. 

Heinrich Wagner (1959: 152-240), following in Pokorny’s footsteps, focuses most 
of his work upon the structure of the verbal system, but posits that a language 
of Afro-Asiatic typology, but not necessarily Afro-Asiatic genetically, must have 
been spoken in Ireland and Britain and deeply influenced the evolution of the 
Insular Celtic languages. 

George Brendan Adams (1975) stresses the significant portion of the Insular 
Celtic lexicon for which Indo-European etymologies are not available, while also 
addressing some of the parallel linguistic features listed
above. He also considers how archeological evidence may bear upon theories of
an Afro-Asiatic substratal influence on Insular Celtic. 

In his unpublished 1993 doctoral dissertation (extracts of which were published 
in 2007), Orin Gensler compares the structures of Insular Celtic and Afro-Asiatic 
along with 58 other languages from around the world and identifies 17 “exotic” 
features shared by Insular Celtic and Afro-Asiatic, but which are otherwise 
uncommon cross-linguistically. He does not argue that this array of exotic 
parallels proves that the Insular Celtic languages were affected by an Afro- 
Asiatic substratum, but the implication is that there could hardly be another 
explanation. 

Karel Jongeling (2000) posits a variation of the substratum hypothesis somewhat 
similar to that of Wagner whereby an unknown language (group) influenced the 
development of both Insular Celtic and Afro-Asiatic, thus explaining the linguistic 
features that they share. Their geographic discontinuity is to be explained as a 
result of the Indo-European conquest of the European continent, which erased 
any influence of this substratum in the intervening area. 

Theo Vennemann (e.g. 2002; 2003) views the theory that an Afro-Asiatic substratum 
influenced the evolution of Insular Celtic, and, through it, English, as effectively 
proven beyond all reasonable doubt. 


2.4 Some recent critics 


The Afro-Asiatic substratum theory has never found much favor with scholars 
of the Celtic languages. Much of the criticism, however, has been of an impres- 
sionistic nature, though two recent articles are on a much higher plane. 



Steve Hewitt (2007) reviews the scholarship in favor of finding a substratum 
behind the exotic features of Insular Celtic and evaluates 39 linguistic features 
proposed in the literature that purport to show parallels between Insular Celtic 
and Afro-Asiatic. He finds that over one third of these features are common cross- 
linguistically, that at least five are not directly comparable between Insular Celtic 
and Afro-Asiatic, that over 10 percent are only marginally attested in either 
Insular Celtic or Afro-Asiatic, and that over 25 percent are found in only part of 
Insular Celtic or part of Afro-Asiatic. If the presumed substratal effect occurred 
when the Insular Celtic languages were still a cohesive speech community, the 
last point causes one to wonder why a supposed substratal feature should 
appear only in Irish, or Welsh, or Breton, etc. 

Hewitt also calls attention, with reference to Gensler’s statistical approach, to 
the fact that proponents of the Afro-Asiatic substratum theory focus exclusively 
on shared features. No mention is made of all the features that are not shared. 
Failure to do so does not allow one to gauge the overall similarity of the two 
language families. And, again with reference to Gensler, Hewitt calls attention to 
the fact that the 17 exotic features identified as common to both Insular Celtic 
and Afro-Asiatic are actually “more consistently present” in Insular Celtic than 
in Afro-Asiatic, surely an unexpected outcome if the argument is that these 
shared features developed in Insular Celtic as the result of contact. 
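
Hewitt's methodological point can be made concrete with a small sketch. The Python fragment below contrasts a raw count of shared features with a measure that also takes unshared features into account. The Jaccard index is chosen here purely for illustration; nothing in the sources discussed implies its use. The feature labels and the two inventories are invented placeholders, not data from either language family.

```python
def shared_count(a: set[str], b: set[str]) -> int:
    """Number of features present in both inventories."""
    return len(a & b)


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared features as a proportion of all features found in either inventory."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical placeholder inventories (labels f1..f12 stand for arbitrary features).
family_x = {"f1", "f2", "f3", "f4", "f5", "f6"}
family_y = {"f1", "f2", "f3", "f7", "f8", "f9", "f10", "f11", "f12"}

print(shared_count(family_x, family_y))  # counts shared features only
print(jaccard(family_x, family_y))       # shared features relative to all features
```

On these invented inventories the shared count alone says nothing about how many features separate the two sets, whereas the proportional measure does; this is the sense in which a list of parallels cannot by itself gauge overall similarity.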

Graham Isaac (2007) presents a vigorous attack against the substratum theory, 
noting, like Hewitt, that many of the proposed parallels with Afro-Asiatic are 
common cross-linguistically. He also pulls together many observations of earlier 
scholarship which argue that the exotic features of the Insular Celtic languages 
could have developed internally, without the push or even the guiding force of an external
linguistic entity. In the end, Isaac concludes not that the Afro-Asiatic substratum
theory is “unproven or unprovable,” but that “it is simply wrong.” 


3 The Evidence of Ancient Celtic 
3.1 Linguistic features of Continental Celtic 


Virtually all of the scholarship on the question of an Afro-Asiatic substratum under- 
lying the Insular Celtic languages has failed to take note of the evidence of the 
Celtic languages of the ancient European continent. As noted in section 1.1, these 
languages, mostly attested in the Mediterranean and the Alps where the presence 
of an Afro-Asiatic substratum presumably is out of the question, resemble other 
ancient Indo-European languages in their linguistic structures. It would be very 
important to the discussion to know whether any of the exotic features of the 
Celtic languages of Ireland and Britain can be found in those of ancient con- 
tinental Europe. What we find is that, despite the fragmentary attestation of these 
languages, a number of the most striking exotic features are, indeed, present in 
some form. 
3.2 Conjugated prepositions


It is clear that the conjugated prepositions of Old Irish, as exemplified by Old Irish 
do ‘to’, continue prepositions with attached clitic pronouns: 


(2) 1.sg       dom     1.pl   dún
    2.sg       duit    2.pl   duib
    3.sg  m.   dó      3.pl   duaib
          f.   di
          n.   do


Elsewhere in the Indo-European languages, such structures are known to occur 
in the Anatolian language Hittite, e.g. katti=tti ‘with you’. Though precisely such 
a structure is not yet attested in the Continental Celtic languages, Transalpine Celtic 
does possess the connective form du=ci, composed of the preposition ‘to’ plus the 
clitic locative singular form of the deictic pronominal *kej, thus literally ‘to here’. 
The presence of such a structure guarantees that at least the later-attested 
Continental Celtic languages had structures of the type that later evolved into the 
conjugated prepositions of the Insular Celtic languages. 


3.3 Verb-initial clausal configuration 


It is clear that the unmarked clausal configuration of the later-attested Continental 
Celtic languages, late Cisalpine Celtic and Transalpine Celtic, was SVO with pro- 
drop (Eska 2007), e.g.: 


(3) a. Martialis   Dannotali   ieuru            Ucuete     sosin
       M.NOM.SG    D.GEN.SG    give.3.SG.PRET   U.DAT.SG   this.ACC.SG
       celicnon
       edifice.ACC.SG
       ‘Martialis, son of Dannotalos, gave this edifice to Ucuetis.’

    b. reguc                  cambion
       straighten.1.SG.PRES   crooked.ACC.SG
       ‘I straighten the bent thing.’


However, by a constraint known as Vendryes’ Restriction, whereby second- 
position clitic pronominals could only be hosted by the verbal complex, the verb 
is frequently drawn to clause-initial position, e.g.:⁷


(4) a. sioxt            =iᵢ             albanos    pannaᵢ          extra
       add.3.SG.PRET    3.ACC.PL.NEUT   A.NOM.SG   vessel.ACC.PL   beyond
       tud(don)           ccc
       allotment.ACC.SG   300
       ‘Albanos added 300 vessels beyond the allotment.’

    b. Akisios    Arkatoko{k}materekos   to=   śoᵢ=            kote
       A.NOM.SG   A.NOM.SG               PV    3.ACC.SG.MASC   give.3.SG.PRET
       atomᵢ           teuo-xtonion
       border.ACC.SG   god-man.GEN.PL
       ‘Akisios Arkatokomaterekos, he gave the boundary (stones) of gods
       and men.’


In view of the fact that in typical human speech, the majority of clauses con- 
tain pronominals once the subjects of a discourse have been established, many 
scholars of the Celtic languages are of the opinion that Vendryes’ Restriction 
was largely, though not exclusively, responsible for the eventual development 
of unmarked verb-initial clausal configuration in the Insular Celtic languages 
(e.g. Eska 1994). 


3.4 Special relative form of the verb 


Old Irish possessed in the first and third person plural special relative forms of the 
verb which continue a verbal form to which an uninflected clitic relative particle 
is attached, e.g. third person plural present bertae ‘who bear’ < *beronti=io. The 
only vestiges in the Brittonic languages are Middle Welsh yssyd and Middle Breton 
so/zo ‘who/which is’ < *esti=io. Forms of the identical structure as those which 
underlie the Insular Celtic forms are attested in Transalpine Celtic by dugijonti=jo 
‘who serve’ and toncsijont=jo ‘who will destine’. 


3.5  Polypersonal verbs 


Just as the personal affixes of the conjugated prepositions of the Insular Celtic 
languages continue clitic pronominals which have suffered phonological attrition, 
so also do the so-called infixed and suffixed pronouns of those languages continue 
clitic pronominals which have evolved into personal affixes which are exponents 
of object agreement. Cf. the structure of the Old Irish verbs with personal object
agreement affix with that of the Continental Celtic verbs with personal object agreement
clitic:⁸


(5) Old Irish:

    a. do-s=             mbeir
       PV-3.SG.OBJ.FEM   give.3.SG.PRES
       ‘S/he gives it.’

    b. beirth-i
       bear.3.SG.PRES-3.SG.OBJ.MASC/NEUT
       ‘S/he bears it.’

    Continental Celtic:

    c. to=   śo=             kote
       PV    3.ACC.SG.MASC   give.3.SG.PRET
       ‘He gave it.’

    d. sioxt           =i
       add.3.SG.PRET   3.ACC.PL.NEUT
       ‘He added them.’


3.6 Infixing/suffixing alternation of object markers 


The evidence presented in section 3.5 demonstrates that the infixing/suffixing alter- 
nation of object agreement markers also existed in the later-attested Continental 
Celtic languages. 


4 Some Conclusions 


The discussion in sections 2.4 and 3 seems to undermine much of the argument 
for a specifically Afro-Asiatic substratum being ultimately responsible for some 
or all of the exotic features of the Insular Celtic languages. Still, there surely were 
human languages spoken in Ireland and Britain prior to the arrival of speakers 
of Celtic language(s). Equally, these pre-Celtic languages very likely had some
impact on the speakers of proto-Insular Celtic. We simply do not know what form
that impact may have taken. Even though the fragmentary evidence of the 
Continental Celtic languages makes it clear that purely Celtic-internal factors are 
responsible for the development of some of the exotic features of the Insular Celtic 
languages, external linguistic contact may have helped push the evolution of these 
features forward.⁹ Indeed, the presence of a particular configuration of a linguistic
structure, however incipient in Insular Celtic, but also present in a substratal 
language, may have been a necessary precondition for such influence to take 
place. As Mithun (1992: 89) has remarked, it is often difficult to tease internal and 
external forces apart: 


Causes of linguistic change have traditionally been classified into two types: inter- 
nal and external. Frequently cited internal causes of change include such factors as 
speakers’ preferences for simple and transparent systems, which can prompt learn- 
ers to remodel apparently irregular or opaque paradigms. The most commonly cited 
external cause of change is language contact. While the distinction between internal 
and external causation may be clear-cut in some cases, the separation of these 
factors is not always straightforward or even desirable, particularly in the area of 
syntactic change. 

Much syntactic development is driven by an interplay between internal and exter- 
nal factors. Grammaticization may seem to reflect a purely internal process: the cog- 
nitive routinization of patterns of expression. Yet structures are not grammaticized 
randomly. Speakers automate those structures they use the most often. Similarly, 
syntactic borrowing may seem to represent a purely externally caused development: 
it is dependent on external contact with other languages, under appropriate 
conditions of relative prestige and bilingualism. Yet aspects of the internal structure 
of the borrowing language can affect the facility with which a prospective loan is 
integrated. 
Internal and external factors can be difficult to untangle because syntactic change
is so often the result of their interaction. For the same reason, examining the effects 
of either in isolation can be a mistake if we are to make progress in understanding 
the causes of syntactic change. 


We seem to be left, then, in a position in which it is difficult, if at all possible, to 
draw any firm conclusions. Given the quite large number of Insular Celtic lexical 
items which, as yet, do not appear to have an Indo-European etymology, it seems 
inevitable that we must conclude that the pre-Celtic language(s) of Ireland and 
Britain had a significant impact upon proto-Insular Celtic. Whether this language 
(group) guided the evolution of the syntax of the Insular Celtic languages in 
any significant way is very hard to know. However, in view of the fact that the 
fragmentarily attested Continental Celtic languages show evidence for the devel- 
opment of some of the “exotic” features of Insular Celtic in geographical regions 
in which a substratum, especially an Afro-Asiatic one, is extremely unlikely, the 
probability of demonstrating such influence would appear to be so low as to be 
virtually nil.¹⁰


NOTES 


1 And its descendants, Scottish Gaelic and Manx.

2 I ignore here the area of onomastics, in which considerable borrowing took place in
both directions. See Stüber (2007) for a recent analysis of personal names.

3 Even for the meagerly attested Galatian, there is evidence for language mixing in
the hybrid idionym Αρμεδυμνος, a compound of Lycian arma- ‘moon’ and Galatian
dumno- ‘world,’ attested in a Greek inscription of coastal Asia Minor (Freeman 2007).

4 The only linguistic features borrowed into the ancient Celtic languages which have
been proposed are the patronymic suffix -alo/a- of early Cisalpine Celtic, which has
been linked to the genitive singular exponent -al of the Etruscoid language Raetic
(Pedersen 1921), and the thematic genitive singular -ī, which appears in Latin no earlier
than ca. 300 BCE and in Cisalpine Celtic no earlier than ca. 200 BCE, and is likely
an areal phenomenon shared also by Messapic and, perhaps, Venetic (Eska & Wallace
2001: 92).

5 See the articles in Ureland and Broderick (1991) on language contact in Britain and 
Ireland in general. 

6 Grammatical abbreviations: ACC = accusative; DAT = dative; GEN = genitive; MASC =
masculine; NEUT = neuter; NOM = nominative; OBJ = object; PL = plural; PRES =
present; PRET = preterite; PST-PTCPL = past participle; PV = preverb; SG = singular.

7 Note that the position of the second-position clitic pronominal informs us that Akisios 
Arkatokomaterekos in (4b) is a left-dislocated NP. 

8 Note that both of the Continental Celtic clauses in which these verbs occur, set out in 
(4), are tokens of clitic doubling, thus leaving no doubt that they mark object agreement. 

9 Such an approach is adopted by Hickey (2002a; 2002b). 

10 There are other approaches which have been advocated to explain the evolution of
some of the exotic features of the Insular Celtic languages. Matasović (2007), for
example, proposes that mutual interference among Irish, the Brittonic languages,
and Vulgar Latin in the period 350-550 CE was the cause, citing as a comparandum
the languages of the Balkan Sprachbund, in which some “exotic” features (from the
perspective of other European languages) arose which cannot be attributed to
substratal languages of the region.
27 Spanish and Portuguese
in Contact 


JOHN M. LIPSKI 


1 Introduction: Spanish in Contact 


There are currently some 400 million native speakers of Spanish scattered across 
every continent except Antarctica. With the exception of those in the Iberian 
Peninsula, all are the linguistic heirs of the Spanish language diaspora that began 
in 1492 — with the expulsion of Spanish-speaking (Sephardic) Jews, and with the 
arrival of the Spanish language in the Americas. Although many regional languages 
were spoken in fifteenth-century Spain (and most are still spoken even today), 
only Castilian took root in other continents, in itself a remarkable development. 
More remarkable still is the regional and social variation which characterizes 
modern Spanish, particularly in Latin America; some of the differences among 
Spanish dialects around the world are reflected in dialect divisions in contem- 
porary Spain, while others are not. In addition to being the offspring of specific 
regional and social dialects of Spain, the varieties of Spanish spoken in the dias- 
pora owe much of their subsequent diversification to contact with other languages, 
under a variety of circumstances. Among the more salient contemporary contact 
zones are the following: 


(1) a. Spanish as an official or co-official national language: 

Spain: contact with regional languages, including Basque, Catalan, 
Galician, Asturian 

Andorra: contact with Catalan, French 

Dominican Republic: contact with Haitian Creole and Jamaican Creole 
English 

Puerto Rico: contact with English, West Indian creole Englishes, Haitian 
Creole 

Cuba: vestigial contact with Haitian Creole, Jamaican Creole English 

Mexico: contact with Yucatecan Maya, Nahuatl, Mixtec, Huastec, and other 
indigenous languages; Veneto 

Guatemala: contact with several Mayan languages; Garifuna 
Honduras: contact with US English, West Indian Creole English, Miskito,
Garifuna, and smaller indigenous language groups 

Nicaragua: contact with Miskito Coast Creole English, Miskito, Sumo, 
Rama, and smaller indigenous language groups 

Costa Rica: contact with West Indian creole English, Bribri, and smaller 
indigenous language groups 

Panama: contact with West Indian Creole English, Cantonese, Guaymi,
Chocó, and smaller indigenous language groups

Colombia: contact with Palenquero (Afro-Hispanic creole), Chocó,
Quechua, and numerous languages of the Amazon basin

Venezuela: contact with Guajiro (Guay), West Indian Creole English, 
and with several smaller indigenous language groups 

Ecuador: contact with Quechua, Colorado, Shuar, and smaller indigenous 
language groups 

Peru: contact with Quechua, Aymara, Axinica, Chipibo, Aguaruna, and 
smaller indigenous language groups 

Bolivia: contact with Aymara, Quechua, Guarani, Portuguese, and 
smaller indigenous language groups 

Paraguay: contact with Guarani, Japanese, Plattdeutsch, Portuguese, 
and smaller indigenous language groups 

Chile: contact with Mapuche, Aymara, German 

Argentina: contact with Guarani, Portuguese, and smaller indigenous lan- 
guage groups 

Uruguay: contact with Portuguese 

Equatorial Guinea: contact with Pidgin English, French, Fang, Bubi, 
Playero languages, Annobonese creole Portuguese 


b. Spanish as a language of border contact or historical colonization: 
United States: contact with English 
Gibraltar: contact with English 
Morocco: contact with Arabic, French 
Philippines: contact with Tagalog, other Philippine languages, Chabacano 
(Philippine creole Spanish) 
Haiti: contact with Haitian Creole 
Aruba and Curaçao: contact with Papiamentu (also English, Dutch)
Trinidad: contact with West Indian English, Trinidad French Creole 
Belize: contact with Belize Creole English, Mayan languages 


c. Spanish as a language of immigrants from non-neighboring countries 
Scandinavia: contact with Swedish, Norwegian 
Australia: contact with English 
Israel: contact with Hebrew, Yiddish 


In Spanish America, Spanish came into contact principally with indigenous 
languages, but also with African languages spoken by hundreds of thousands of 
slaves (Lipski 2005), and to a lesser extent languages of voluntary immigration
in later centuries, mainly Italian, English, Cantonese, and Afro-European creole 
languages of the Caribbean, such as Haitian Creole, Jamaican Creole, and 
Papiamentu (Lipski 1999a; 1999b). In North America, Spanish came into contact 
with English in the United States via two principal mechanisms. The first was the 
territorial expansion of the United States to engulf Spanish-speaking regions 
originally part of another nation, specifically Mexico, and, after the Spanish- 
American War of 1898, also Puerto Rico. The second — and quantitatively much 
larger — source of Spanish speakers in the United States is the voluntary immigration 
of Spanish speakers from all parts of the Spanish-speaking world, with Mexico, 
the Caribbean, and Central America contributing the greatest numbers (Lipski 
2008b). Outside of the Iberian Peninsula and the Americas, stable communities 
where Spanish is in contact with other languages are relatively scarce. Within Europe 
the only probative case is Gibraltar, an officially English-speaking colony in 
which Spanish is the de facto dominant language (Lipski 1986; Moyer 1992). Nuclei 
of Spanish guest-workers were once numerous in northern European industrial 
cities, but these have shrunk as a consequence of Spain’s growing prosperity, and 
those Spaniards remaining in other countries no longer live in discrete speech com- 
munities. In Africa Spanish is spoken vestigially in Morocco (Ghailani 1997; 
Sayahi 2005a; 2005b; 2005c; Scipione & Sayahi 2005), and in the Western Sahara 
in contact with vernacular Arabic dialects known as hasania, although due to the long- 
standing civil war in this area (officially part of Morocco), most Spanish-speaking 
Saharauis live in refugee camps in Algeria or outside of North Africa entirely (Tarkki 
1995). In sub-Saharan Africa Spanish is the official language of Equatorial Guinea, 
and is spoken as a strong second language by most Guineans (Lipski 1985a; Quilis 
& Casado-Fresnillo 1995). In the Philippines, Spanish did not take root as a colonial 
language, although Spanish-derived creole languages are spoken in the Manila 
Bay communities of Cavite and Ternate, and in the city of Zamboanga, on the 
southern island of Mindanao. There are a few native speakers of non-creole 
Philippine Spanish, mostly upper-class mestizo families no more than two gener- 
ations removed from Spain, but there are no Hispanophone speech communities 
in the Philippines (Lipski 1987d; 1987e; 1987f). Some of the configurations that 
brought Spanish into contact with other languages have disappeared, while 
others remain to the present day, shading and molding the many reincarnations 
of Spanish around the world. 

The number and types of contact situations involving contemporary Spanish 
are so vast and far-ranging that a panoramic survey would yield little more than 
a catalog of disparate facts that shed little light on the nature of language contact. 
The first portion of the present chapter will therefore concentrate on four proto- 
typical cases in which a plausible case can be made for contact-induced variation. 
Since lexical borrowing, calquing, and semantic convergence are predictable, 
widespread, and unremarkable correlates of language contact, the following 
sections will concentrate on more revealing instances of structural change in 
contact environments. The cases will involve morphosyntax, lexico-semantics, and 
phonetics, respectively. 
2 Spanish in Contact with Indigenous Languages:
Clitic Doubling in Andean Spanish 


Since the beginning of the sixteenth century Spanish has come into contact with 
dozens of indigenous languages in the Americas, of which only a handful have 
resulted in stable speech communities and demonstrable imprints on Spanish that 
go beyond simple lexical borrowing. In most instances the indigenous speakers 
in question belonged to geographically large and sedentary pre-Columbian societies
with a high degree of central organization as well as imperial aspirations.

Although lexical borrowing from indigenous languages is a mainstay of Latin 
American Spanish, demonstrable morphosyntactic influence of Native American 
languages on Spanish is rarely attested. One promising case involves the epiphe- 
nomenon of doubled direct object clitics in several regions of Latin America, with 
the common thread being current or past bilingual contact with a Native American 
language, including Nahuatl, Mayan languages, Guarani, Quechua, and Aymara. 
Although all varieties of Spanish allow and even require the combination of direct 
object clitic and a full direct object in limited circumstances, the cases to be dis- 
cussed would be ungrammatical in noncontact varieties of Spanish. In some cases 
the ungrammaticality results from lack of gender and number agreement between 
the direct object and the clitic, while in other instances the nature of the direct 
object itself precludes the simultaneous presence of an object clitic. A selection of 
examples illustrates the range of phenomena; in the following cases the direct object 
clitic would not be allowed in monolingual Spanish varieties since the direct object 
is inanimate. Moreover the clitic involved is usually the invariant lo which is 
normally associated with masculine singular objects, even when the direct object 
is feminine and/or plural: 


(2) Peru: contact with Quechua
Le pedí que me loᵢ calentara la planchaᵢ (Pozzi-Escot 1972: 130)
‘I asked her to heat up the iron for me’ (direct object is feminine)

Northwestern Argentina: contact with Quechua
¿Me loᵢ va a firmar la libretaᵢ? (Gómez López de Terán & Assis 1977; Rojas 1980: 83)
‘Will you sign the book for me?’ (direct object is feminine)

Bolivia: contact with Aymara
A minutos de su llegada, loᵢ cerró la puertaᵢ (Mendoza 1991: 103)
‘a few minutes after he arrived, he shut the door’

Ecuador: contact with Quechua
Leᵢ veo el carroᵢ (Suñer & Yépez 1988)
‘I see the car’

Paraguay: contact with Guarani’
Leᵢ veo a ellaᵢ (Krivoshein & Corvalán 1987: 37)
‘I see her’ (direct object is feminine)

Mexico: contact with Nahuatl
loᵢ compramos la harinaᵢ (Hill 1987)
‘we buy the flour’ (direct object is feminine)

Chiapas, Mexico: contact with Mayan languages
Loᵢ arreglé la casitaᵢ (Francis Soriano 1960: 94)
‘I cleaned the house’ (direct object is feminine)

Yucatan, Mexico: contact with Mayan languages
¿Ya loᵢ anunciaste la bodaᵢ? (Suárez 1979: 180)
‘Did you already announce the wedding?’ (direct object is feminine)

El Salvador: contact with Pipil (a variety of Nahuatl)
yo no luᵢ tengu milpaᵢ (Baratta vol. 2: 611)
‘I don’t have a cornfield’ (direct object is feminine)

Leticia, Colombia: contact with Amazonian languages
Loᵢ maté una dantaᵢ (Rodríguez de Montes 1981: 104)
‘He killed a wild pig’ (direct object is feminine)


Although it might be surmised that syntactic calquing resulting from substratum 
interference is involved in each case, closer examination reveals that no word-by- 
word translation from an indigenous language can be postulated for any mani- 
festation of clitic doubling throughout Latin American Spanish. In each of the dialect 
zones where clitic doubling occurs, however, fortuitous syntagmatic and syntactic 
coincidence with indigenous languages has propelled Spanish object clitics into 
new patterns which represent contact-induced innovations. The clearest case for 
the role of indigenous language contact in clitic doubling occurs in the Andean
zone (the highlands of Bolivia, Peru, and Ecuador, and immediately adjacent areas
of northeastern Chile, northwestern Argentina, and southwestern Colombia). For
many, Andean Spanish is implicitly synonymous with Quechua substratal influ- 
ences, at times with Aymara taken into consideration. Grammatical traits labeled 
as “Andean Spanish” are more frequently confined to the speech of bilinguals for 
whom Spanish may be the recessive language, and may not be found in mono- 
lingual Spanish of the same geographical areas. In the Andean dialect zone, clitic 
doubling of the sort illustrated above straddles the sociolinguistic divide between 
indigenous interlanguage and regionally acceptable Spanish. In most of the regions 
clitic doubling with inanimate direct objects is sociolinguistically acceptable as long 
as the clitic agrees with the direct object in both gender and number, as in other 
monolingual varieties of Spanish. Non-agreeing clitics of the sort loᵢ veo la casaᵢ
‘I see the house’ are quite stigmatized, although in some speech communities these
combinations pass virtually unnoticed.’ 

Spanish of all regions permits a direct object NP to be replaced by a clitic, regardless
of the animacy of the direct object (DO); thus:


(3) Veo a Juan/el libro ‘I see John/the book’ 
Lo veo ‘I see him/it’ 
When the DO is a personal pronoun (i.e. [+animate]), both the clitic and the full
pronoun may appear; indeed, for most dialects, if the postverbal pronominal DO 
occurs, a preverbal clitic must accompany it: 


(4) Loᵢ/*Øᵢ veo a élᵢ ‘I see him’


In a subset of Spanish dialects (particularly in the Southern Cone, including 
Argentina, Uruguay, Chile, and sometimes Paraguay and Bolivia), clitic doubling 
of ([+animate]) DO nouns is also possible, and often even preferred: 


(5) Loᵢ/Øᵢ veo a Juanᵢ ‘I see John’


As shown above, the Andean varieties of Spanish permit, and for large numbers 
of speakers actually require, clitic doubling of inanimate [+definite] DOs. Clitic 
doubling occurs only with direct objects; lo is not combined with intransitive verbs,
in locative constructions, or in other combinations where no direct object is 
involved, as sometimes occurs in other Spanish interlanguage varieties. 

In order to illustrate a possible route of formation of direct object clitic doubling 
in Andean Spanish as the result of language contact, the case of Spanish–Quechua
contacts will be analyzed in more detail. Quechua marks direct object nouns with 
the suffix -ta, or -man if following a verb of motion (Lastra 1968; Cusihuaman 1976; 
Catta 1985; Cole 1985; Galvez Astorayme 1990). This suffix is invariable, cliticizes 
to all direct object nouns whether definite or indefinite, and even attaches to ques- 
tions and relative clauses, as shown by the following (Peruvian) examples (an 
approximation in “Andean” Spanish is given in parentheses): 


(6) Tika-ta kuchu-ni
Flower-ACC cut-1SG = ‘I cut the flower’ (lo corto la flor)


Ima-ta kuchu-ni?
What-ACC cut-1SG = ‘What do I cut?’ (¿qué lo corto?)


Challwa-ta apa-nki
Fish-ACC carry-2SG (FUT) = ‘You will carry fish’ (lo llevarás pescado)


Asta-ni unu-ta
Carry-1SG water-ACC = ‘I carry water’ (lo acarreo agua)


The accusative marker -ta does not occupy the same syntactic position as the
invariable lo of the corresponding Andean Spanish sentences, which would be
roughly as indicated above. The Spanish clitic lo occupies the immediate preverbal
position, while in Quechua -ta attaches to the end of the direct object noun. In a
canonical Quechua SOV transitive sentence where the direct object immediately 
precedes the verb, -ta coincidentally comes just before the verb, i.e. in the iden- 
tical position to Spanish proclitic lo. However, it would be easy for a speaker of 
Spanish interlanguage to interpret the clitic lo, statistically the most frequent, as
some sort of transitivity marker comparable to Quechua -ta. It is not irrelevant
that Spanish lo itself marks an accusative relationship, albeit not in the fashion
of Quechua -ta. Most contemporary syntactic analyses of Spanish “direct object”
clitics (at least since Franco 1993) regard these elements as spell-outs of verb–object
agreement, but once morphological agreement is suspended, the clitic lo serves
no other function than to mark the verb as transitive. A speaker of the developing
indigenous interlanguage, encountering preverbal lo only in clearly transitive
sentences (including the possibility of clitic doubling with human DOs, as in the
Southern Cone), would be likely to overgeneralize the need for lo to appear in all
transitive clauses. Since the quintessential Quechua-influenced interlanguage 
maintains an O-V word order, Spanish lo would at first be misanalyzed as a case 
marker attached to the noun, in a direct calque of Quechua -ta: 


(7) el poncho-lo tengo
ART poncho-ACC have (1SG) 


As interlanguage speakers develop greater fluency in Spanish, word order gravitates
to the more usual V-O for nonclitic DOs. It is at this stage that lo, now fully
integrated as an object clitic in Spanish, retains its proclitic position, resulting in
the clitic-doubled pattern. In this instance the results of language contact have 
to be approached indirectly and circumstantially: clitic doubling with invariant 
lo occurs only where Spanish is in contact with indigenous languages, and the 
invariant clitic lo is congruent to another invariant monosyllabic element in the 
indigenous language (in this case, Quechua) in a semantically and syntactically 
congruent environment. This type of contact-induced structure is different from 
the more usual calques and word-order modifications, and highlights the need 
to move beyond simple formulas when assessing the effects of language contact. 


3 Spanish in the United States: The 
Grammaticization of pa(ra) atrás


With as many as 40 million Latinos, most of whom speak at least some Spanish, 
the United States is on the way to becoming the world’s third largest Spanish- 
speaking nation, if it has not already achieved that position. The Spanish language 
in the US is almost always in constant contact with English; many Latinos in the 
US use English more frequently than Spanish on a daily basis, almost all have 
more formal education in English than in Spanish, and an undetermined but 
certainly large segment of the Latino population is demonstrably more proficient 
in English than in Spanish. 

Within the US, loan translations are commonplace among bilingual Spanish–English
speakers, and many are so subtle as to pass unnoticed, especially when
they represent slight departures from worldwide Spanish patterns: soñar de instead
of soñar con ‘to dream of’ or even tomar una clase instead of seguir un curso ‘to take
a class’. Others, like tochar ‘to touch’ and puchar ‘to push’, arise spontaneously in 
bilingual conversations, but have not become consolidated in US varieties of Spanish. 
By far the most commonly cited, and most often criticized, apparent loan translation
found in all bilingual Spanish-English communities in the US is the use of
para atrás (usually pronounced patrás) ‘toward back’ or ‘backwards’ as a translation
of the English verbal particle back, as in to call back, to pay back, to talk back,
to give back. In the Spanish of various bilingual Latino groups in the US, patrás
combines with the respective Spanish verbs with the same meanings as the 
English verb + back constructions. Examples include: 


(8) Llamar patrás ‘to call back’
Dar patrás ‘to give back’
Venir patrás ‘to come back’
Hablar patrás ‘to talk back’
Pagar patrás ‘to pay back’
Mover(se) patrás ‘to move back’


Constructions based on patrás have been documented for all Spanish-speaking
communities in the US, as well as occasionally within Puerto Rico. In particular the
use of patrás is well known among Mexican-American/Chicano, Puerto Rican,
Dominican, and Cuban Spanish speakers born or raised in the United States (Pérez 
Sala 1973: 67; Varela 1974; Sanchez 1983; Lipski 1985b; 1987a). It has made its way 
into popular US Latino literature, for example from the novel La vida es un special 
by the Cuban-American Roberto Fernandez (1981: 74): “Llamame pa tra cuando 
tenga un tiempito” (‘call me back when you have a moment’). The same usage 
is found among more recent Central American immigrants. To demonstrate the 
spontaneity with which patrás constructions can arise in the absence of imitation
of more established bilingual varieties, identical constructions are found among
the Isleños of St. Bernard Parish, Louisiana, descendants of Canary Islanders who
arrived in the late eighteenth century and were removed from contact with other 
varieties of Spanish for more than two centuries (Lipski 1987b): 


(9) Ven pa trah mañana ‘come back tomorrow’
Dio quiera que eso tiempoh nunca vengan pa tra ‘may God will that those times
never come back’
Te ponian el pie pa tra ‘they put your foot back’
Tuve que darselo pa tra ‘I had to give it back to him’
Cuando se acaba la pehca, se va pa trah pal trabajo ‘when the fishing season is
over, he goes back to his job’


Identical constructions are found in the speech of “Sabine River” or “Adaeseño”
Spanish speakers in northwestern Louisiana and northeastern Texas, tiny enclaves 
descended from Mexican soldiers at Spanish military garrisons from the 1730s, 
and similarly isolated from other varieties of Spanish (Lipski 1987c; 1988; 1990b): 
vamos patrás ‘let’s go back’; vuelva patrás ‘come back’. Patrás constructions are
also found in the Spanish of Gibraltar. Although Gibraltar is a nominally English- 
speaking crown colony of Great Britain, the vast majority of its native-born residents 
speak Spanish as a first language, since they are descended from mixed marriages
between Spaniards and Britons. The Spanish-English interface of Gibraltar bears 
a strong resemblance to the bilingual Hispanic communities of the US, in terms of 
specific lexical Anglicisms, and especially in the area of calques, code-switching, 
and syntactic interference (Lipski 1986; Moyer 1992). Spanish-speaking Gibraltarians 
make frequent use of constructions with patrás, in ways that exactly parallel US
Latino usage, as indicated by the following examples collected in Gibraltar: 


(10) Vengo pa trah mañana ‘I’m coming back tomorrow’
Por favor, pongalo pa trah ‘please put it back’
Cuando quiera, te lo doy pa trah ‘I’ll give it back whenever you want’


In these combinations, which are also found in Dutch and in somewhat similar 
fashion in German, back is not acting as a preposition or adverb, but rather as a 
particle associated with the verb. Obligatorily if the object following back is a pro- 
noun, and optionally if it is a full noun phrase, back follows the verb: I'll pay you 
back; give me back the box; I put the book back on the shelf. In the aforementioned 
combinations patrás normally adopts the postposed position in Spanish, unless
the direct object is expressed via a clitic, which obligatorily occurs before the verb: 


(11) Pagué el préstamo pa tras/*Pagué pa tras el préstamo ‘I paid back the loan’ 
Di el libro pa tras/*Di pa tras el libro ‘I gave the book back’ 


However, despite the apparently clear-cut case of syntactic transference from English 
to Spanish, the expression patrás is unique in (fluent) US and Gibraltar Spanish
as a functional equivalent of an English verbal particle; verbal combinations such 
as knock over, sit down, figure out, come through, etc. are virtually never calqued into 
Spanish, despite the fact that their Spanish equivalents are no more common nor 
morphologically less “difficult” than volver ‘to return’, regresar ‘to return’, devolver 
‘to give back’, and the like, all of which underlie patrás constructions. Combinations
involving para atrás are fully consistent with Spanish grammatical usage, and
do not differ structurally in any way from other idiomatic expressions found 
throughout the language, being no more “un-Spanish” than de nada, no hay de qué 
‘you're welcome,’ both of which are grammatically awkward when parsed liter- 
ally (‘of nothing’ and ‘there is not of what,’ respectively). Otheguy (1993) argues 
that patrás constructions are not due to the direct influence of English grammar,
despite the obvious similarities, but in order to account for the fact that patrás
constructions of the sort mentioned above are found only in contact with English
(in all stable Spanish-English contact environments), he acknowledges that the
semantic notions conveyed by the English particle back could well have been 
carried over to Spanish. Otheguy (1993: 35) is certainly correct in asserting that 
pa(ra) atrás constructions are not simple word-for-word calques from English into
Spanish, although semantically the Spanish combinations are virtually identical 
to their English counterparts. At the same time, without the backdrop of a panoply 
of English verbal constructions with back the corresponding Spanish combinations 
do not arise. This is a type of bilingual convergence that goes beyond lexical bor-
rowing and calquing of idiomatic expressions, demonstrating more subtle forms 
of language transfer. 


4 Language Contact and Spanish Phonetics: Italo-
Spanish Contacts in Argentina and Uruguay 


Despite the numerous language contact situations that have involved Spanish for 
the past five centuries, demonstrable cases of phonetic changes attributable to 
language contact are relatively scarce, except in the speech of Spanish-recessive 
bilinguals in some Latin American indigenous communities. The latter instances 
have not coalesced into stable speech communities, but rather represent transi- 
tional interlanguage varieties, which dissipate once full fluency in Spanish has 
been attained. A more robust case may be made for the unique “circumflex” or 
“long fall” pitch accent found on tonic vowels in the Spanish of Buenos Aires, 
Argentina and Montevideo, Uruguay. 

Among the many languages other than Spanish carried by voluntary immigrants 
to Spanish America, few produced lasting imprints on Spanish, largely because 
of the relatively small numbers of speakers involved in comparison with the already 
established Spanish dialect zones. A significant exception to this trend is the case 
of Italian immigration to Buenos Aires and Montevideo, a massive demographic 
displacement whose linguistic effects are readily apparent. To give an idea of the 
magnitude of this immigration, nearly 2.3 million Italians emigrated to Argentina 
alone between 1861 and 1920, with more than half arriving after 1900, making up 
nearly 60 percent of all immigration to Argentina. Most of the immigrants ended 
up in greater Buenos Aires (Bailey 1999: 54), and made up between 20 and 30 
percent of that city’s population. As a result of immigration, largely by Italians,
the population of greater Buenos Aires (including the surrounding countryside) 
grew from 400,000 in 1854 to 526,500 in 1881 and 921,000 in 1895 (Nascimbene 
1988: 11). Similar proportions, scaled down to size, characterize Montevideo for 
the same time period. Italian immigrants were not speakers of standard Italian, the 
result of language planning efforts that had not yet begun in the late nineteenth 
century; they spoke regional dialects and languages, mostly from southern Italy, 
and among the immigrants some dialect leveling inevitably took place, as it does 
in Italy. Given the partially cognate status of Spanish and Italian, interlanguage 
varieties developed that freely combined both Spanish and Italian elements, as 
well as many innovations based on analogy and language transfer. It may well 
have been the possibility for achieving meaningful communication with Spanish 
speakers by making only relatively small departures from their native Italian dialects 
that resulted in long-lasting acquisitional plateaus among Italian immigrants in 
Buenos Aires and Montevideo. 

Immigrants came from all over Italy, which prior to unification in the twen- 
tieth century was truly a patchwork of oftentimes mutually unintelligible regional 
dialects and languages. A speaker of Piemontese could not communicate with a 
Calabrese unless some linguistic common denominator were found. Nowadays
standard Italian, based loosely on educated Florentine speech, bridges the gap, 
but in the nineteenth and early twentieth centuries the rural residents who made 
up the bulk of Italian immigration to Latin America had not been the benefici- 
aries of any language planning effort and were usually unaware of any language 
or dialect other than their own. At the same time most immigrants had little or 
no awareness of the social and linguistic conditions that awaited them upon arrival 
at their destination. As a consequence, considerable linguistic improvisation 
and dialect leveling took place among Italian immigrants in Buenos Aires and 
Montevideo, even as they were coming to terms with the reality of their new 
situation. Instrumental in fomenting a pan-Italian linguistic integration was 
the infamous Hotel de los Inmigrantes in Buenos Aires, which at its peak had a 
capacity for over 8,000 newly arrived immigrants at a time (Blengino 1990: 87-8). 
This initial refuge for indigent immigrants was founded in 1883, and consisted of 
a huge tower-like structure which could be seen from on board ships approaching 
the Buenos Aires harbor. Immigrants from all over Italy were thrust together in squalid 
and hugely overcrowded conditions, and for many it was the first occasion to come 
into contact with the true linguistic diversity of Italy itself. The time period spent 
in the hotel, which could last several months or more, had a leveling effect similar 
to that which has been postulated for the Casa de la Contratacion in Seville, where 
emigrants bound for the Spanish colonies in the Americas waited between six 
months and a year for passage on the next available ship, and where leveling of 
the many social and regional dialects of Spain began. 

Among Italians in Argentina a seemingly paradoxical situation obtained. On 
the one hand the extreme diversity of regional dialects impeded communication 
among many Italian immigrants, except through recourse to the emerging common
second language, Spanish. At the same time each immigrant was able to employ 
a scaffolding of cognate items and similar grammatical structures en route to acquir- 
ing an Italo-Spanish interlanguage. This interlanguage became immortalized in 
the literary cocoliche humorous texts (Meo Zilio 1955; 1956; 1989; Rossell 1970). 

The impact of Italian dialects on the Argentine Spanish lexicon, beginning with 
the underworld slang known as lunfardo and passing into everyday usage, is undisputed.
Two other less easily traceable features are also worth considering, one
segmental and the other suprasegmental. 


4.1 The “long fall” pitch accent of Buenos 
Aires/Montevideo 


In the area of pronunciation, while claims of Italian-like prosody are frequently 
aired, only recently has empirical research been brought to bear on this topic. 
In particular, the notably rising + falling pitch accent on final stressed syllables 
in Buenos Aires and Montevideo Spanish — and now extending to provincial 
varieties in both countries — is impressionistically similar to stereotypical Italian 
patterns. Kaisse (2001) describes the quintessential Argentine “long fall,” in 
which the stressed syllable is significantly lengthened and the tone drops sharply 
across the elongated vowel. This distinctive pattern is combined with early peak
alignment of high tones on prenuclear stressed syllables, similar to that found 
in Andean Spanish (O’Rourke 2004). Colantoni and Gurlekian (2004) provide a 
more detailed acoustic analysis of Buenos Aires pitch accents, and combine these 
results with a sociohistorical overview of the Italian presence in Buenos Aires begin- 
ning in the late nineteenth century. According to Argentine observers from the 
time periods in question, the typical porteño intonation pattern did not exist prior
to the late nineteenth century, which coincides chronologically with the enormous 
surge in Italian immigration. At the same time studies of Italian intonation patterns 
(e.g. D’Imperio 2002 and the references therein) confirm patterns congruent to those 
of modern Buenos Aires Spanish. The circumstantial evidence thus strongly points 
to an Italian contribution to Buenos Aires-Montevideo intonation, not as a simple 
transfer, but as in the case of Andean Spanish, via the creation of innovative hybrid 
patterns that could not be easily extrapolated in the absence of a sustained language 
contact environment. 


4.2 Loss of word-final /s/ in Porteño Spanish


The other area in which the Italian-Spanish interface may be implicated in 
Buenos Aires-Montevideo Spanish is the realization of word-final /s/. Dialects 
of Spanish represent a cline of pronunciation patterns, ranging from the full sibilant 
realization of syllable- and word-final /s/ (the etymologically “correct” pronun- 
ciation) to nearly complete elimination of all postnuclear /s/. The intermediate 
stages, which represent the majority of the Spanish-speaking world, involve some 
kind of reduced pronunciation, usually an aspiration [h]. In nearly all of Argentina, 
syllable-final /s/ is weakened or elided. Final /s/ is retained as a sibilant in a 
shrinking area of Santiago del Estero, and in a tiny fringe along the Bolivian 
border in the far northwest. Among educated speakers from Buenos Aires, aspi- 
ration predominates over loss, which carries a sociolinguistic stigma (Fontanella 
de Weinberg 1974a; 1974b; Terrell 1978). In word-final prevocalic position (e.g. 
los amigos ‘the friends’), sibilant [s] predominates among more formal registers, 
and in the upper socioeconomic classes. Aspiration or elision of prevocalic /s/ 
carries a sociolinguistic stigma in Buenos Aires, although this configuration is the 
logical result of /s/-weakening, following the route taken by many other Spanish 
dialects (e.g. Lipski 1984). 

In both Buenos Aires and Montevideo, aspiration of word-final /s/ in prevocalic 
contexts (as in los amigos ‘the friends’) still carries a social stigma, although such 
pronunciation is common among working-class speakers, and aspiration or loss in 
phrase-final position is also avoided in carefully monitored speech. Preconsonantal 
/s/ is routinely aspirated in all varieties of Argentine and Uruguayan Spanish, 
with the exception of northern Uruguay along the Brazilian border, where a stronger 
final /s/, influenced by the neighboring Portuguese dialect, still prevails. On the 
other hand complete loss of syllable- and word-final /s/ continues to be highly 
stigmatized in Buenos Aires and Montevideo, and is immediately associated with 
uneducated rural and marginalized urban speakers. The interface with speakers 
of Italian dialects is at least partially responsible for the extraordinary range of
/s/-reduction in Rio de la Plata Spanish. None of the Italian dialects implicated 
in contact with Rio de la Plata Spanish contains word-final consonants, although 
word-initial and word-internal /s/ + consonant clusters are common. Moreover 
there are many near-cognates with Spanish in which the only difference is the 
presence of a final /s/ in Spanish and the absence of a consonant in Italian; this 
includes the first person plural verb endings (-mos in Spanish, -iamo in Italian), and 
meno/menos ‘less’, ma/mas ‘but’, sei/seis ‘six’, and many others. These similarities 
provided a ready template for Italian speakers to massively eliminate word-final 
/s/ in Spanish, while retaining at least some instances of word-internal precon- 
sonantal /s/. At the same time the aspirated realization of syllable-final /s/ in 
Argentine/Uruguayan Spanish does not correspond to any regional Italian pronunciation,
and presents a challenge to phonological interpretation. Whereas
speakers of Rio de la Plata Spanish dialects routinely perceive aspirated [h] as 
/s/, and are often surprised to realize that they are equating sibilant and aspir- 
ated variants, speakers of languages where syllable-final aspiration does not occur 
more often perceive the aspiration as a total absence of sound, and reanalyze the 
Spanish words as not containing /s/. Italian immigrants typically dropped final 
/s/ in such items, even when regional varieties of Spanish realized final /s/ 
as an aspiration. Lavandera (1984: 64–6) confirmed that in the pronunciation of
Italian immigrants in Argentina, word-final /s/ completely disappears, while pre- 
consonantal /s/ (which is normally an aspirated [h] in Argentine Spanish), is 
retained as a sibilant [s]. This treatment of /s/, which departs drastically from 
Argentine Spanish, duplicates Italian patterns. 

The veracity of the cocoliche literary texts can be put to the test by comparing 
them with contemporary Italo-Spanish contact language. Italian immigration 
surged in Montevideo in the mid twentieth century, around 1950. The following examples,
presented by Barrios (1996; 1999; 2003; 2006; Barrios & Mazzolini 1999;
Barrios et al. 1994; Ascencio 2003; Orlando 2003), come from Italian immigrants in
Montevideo, all of whom had emigrated from southern Italy in the 1950s:


(12) a. depué [después] de Pinarola poi kedai biuda, e me beni per centro 

‘After Pinarola I was widowed and then I came here to downtown’ 

b. si, tenia do iko [dos hijos] 
‘yes, I had two children’ 

c. kompramu [compramos] nu kampo, nu...e teniano...[teníamos] e teniamo
tutto, faciamo [hacíamos] vino, aciamo [hacíamos] tutto
‘we bought a house in the country, we had everything, we made wine, 
we made everything’ 

d. lu kuatro nietto, aora tengo sei [los cuatro nietos, ahora tengo seis] 
‘the four grandchildren, now I have six’ 

e. otra kosa ke le dammo [damos] a lo canco [los chanchos] 
‘something else that we give to the hogs’ 

f. endonse [entonces] ai etabano [estábamos] todo los enfermero [enfermeros]
‘then there we were, all the nurses’ 

g. ...i nosotto [nosotros] ibamo [íbamos] a la kucina a trabaXare
‘and we went to the kitchen to work’ 


All of these examples combine to implicate the extended interface with Italian 
dialects in the loss of word-final /s/ in lower working-class Spanish of Buenos 
Aires and Montevideo. 


5 Afro-Spanish Phonetics: Equatorial Guinea 


Principally as a result of the Atlantic slave trade, Spanish came into contact with 
numerous sub-Saharan African languages during a period of some four centuries, 
first in Spain and subsequently in the Spanish American colonies (Lipski 2005). 
Despite the fact that in some colonies the population of African origin was 
considerably larger than that of European extraction, little influence of African 
languages on Spanish has been conclusively demonstrated, aside from some 
lexical borrowings. In the Latin American region where the highest proportion 
of people of African origin is found today, namely the Caribbean basin, many 
observers have suggested that the massive elimination of syllable- and word-
final /s/, /l/, and /r/ and the velarization of word-final /n/ that characterize
Caribbean Spanish are due to an African substrate. In point of fact, these traits 
are common to all of southern Spain and the Canary Islands, the dialect zones 
that provided the linguistic input for Caribbean Spanish during most of its 
colonial history. 

More likely candidates for African-influenced pronunciation patterns involve 
intonation and pitch accents, which only recently have been the subject of empir- 
ical study. Megenney (1982) noted that the vernacular speech of predominantly 
black communities in the Dominican Republic was characterized by unusual into- 
national patterns, with declarative utterances ending on a mid tone rather than 
the usually falling tone associated with other Spanish dialects. In a recent study 
of the Afro-Iberian creole language Palenquero, Hualde and Schwegler (2008) 
also demonstrate intonational contours that are atypical of any Latin American 
Spanish dialects. In particular all prenuclear stressed syllables receive a uniformly 
high tone, as opposed to the more usual downdrift and alignment of prenuclear 
high tones with the immediately post-tonic syllable. 

An interesting test of the possible African imprint on certain Afro-Hispanic 
intonational patterns comes from considering the only variety of contemporary 
Spanish in contact with African languages, spoken in Equatorial Guinea (Lipski 
1985a; 1990a; 2000; 2004; 2008a; Quilis & Casado-Fresnillo 1995). In this former 
Spanish colony Spanish is the official language, and is spoken as a second language 
by nearly all citizens. All native languages belong to the Bantu family, and are 
characterized by lexical High and Low tones. One common strategy observed among 
most Equatorial Guineans when speaking Spanish is the more or less systematic 
assignment of a different tone to each syllable, often at odds with the simple 
equation tonic stress = high tone and atonic syllables = low tone that prevails in 
typical contacts between tone languages like those of the Bantu family and pitch
accent languages like Spanish. This is because in the indigenous languages of 
the country (with the exception of Annobonese creole), every vowel carries a 
lexically determined tone, either high or low. When speaking Spanish, the tones 
are rarely used consistently, so that a given polysyllabic word as pronounced by 
a single speaker may emerge with different tonal melodies on each occasion. What 
results is a more or less undulating melody of high and low tones, at times punc- 
tuated by mid tones and rising/falling contour tones, as in the following example: 


(13) H 4H H H H H 
este pitʃisur xjokwandobi njeronlosnixe ranosagi nea
este ‘pichi’ surgió cuando vinieron los nigerianos a Guinea
‘This pidgin English came out when the Nigerians arrived’ 


Such a pronunciation is radically different from the more usual intonational patterns 
in monolingual varieties of Spanish, where the pitch register varies smoothly and 
gradually across large expanses of syllables, and where a syllable-by-syllable tonal 
change rarely or never occurs. To the European ear, a syllable-based tonal alter- 
nation as produced by an African learner of Spanish causes a sing-song cadence, 
and may blur the intonational differences between statements and questions. In 
the absence of a perceptible stress accent, syllable-level tonal shifts may obliterate 
such minimal pairs as trabajo ‘I work’/trabajó ‘he/she worked’. Additional examples
of Guinean Spanish intonation are as follows, where the acute accent indicates 
a high tone, the grave accent a low tone, and no written accent over a vowel 
represents a mid tone (Lipski 2005): 


(14) a. El que tiéne dinéro no habla... 
‘He who has money does not speak’ 
b. Vino él amigo dé su marido. 
‘Her husband’s friend came.’ 
c. Me falta un sóló publo qué no hé ido.
‘There is only one village where I haven’t gone.’
d. Puéde durar sus sesénta años.
‘It [the palm tree] can last sixty years.’
e. Nosótros pagámos ménos.
‘We pay less.’


These examples do not show a totally consistent tone-to-syllable association, but 
noteworthy non-Spanish intonational patterns are evident from these transcrip- 
tions. Many declarative sentences end on a mid or high tone, and occasionally 
even on a rising tone, in contrast to native non-African varieties of Spanish. Many 
instances of lexical stress accent in Spanish have been reinterpreted as lexically 
preattached High tone. The remaining syllables receive Low tone by default, but 
tone terracing results in superficial mid tones occurring with some regularity 
between High and Low tones. The data from Equatorial Guinea provide a con- 
temporary scenario for the type of pitch accent to tone strategies that may have 
impacted Spanish in its contact with African tonal languages during the colonial
period. 


6 Portuguese in Contact 


Spoken in fewer locations than Spanish, the Portuguese language has more 
than 200 million speakers worldwide, spread over four continents; this makes 
Portuguese the sixth most widely spoken language worldwide. In many of these 
venues Portuguese is in contact with other languages, both in border situations 
and in multilingual nations where Portuguese is an official language, normally 
spoken in conjunction with one or more indigenous languages. Substantial 
Portuguese-speaking populations, with a high percentage of Azores origin, are 
found in the United States, Canada, and Australia. Portuguese even enters 
into contact with its linguistic stepchildren, the Portuguese-derived creoles of 
Guinea-Bissau, Cape Verde, and São Tomé and Príncipe. A list of the most extensive
contemporary language contact situations involving Portuguese is: 


(15) a. Portuguese as an official or co-official national language: 
Portugal: contact with Mirandese (Quarteu & Frias Conde 2002), Barranquenho 
(Alvar 1996; Stefanova-Gueorgiev 1987), Galician, Spanish 
Brazil: contact with Tupi-Guarani, Spanish, numerous languages of the 
Amazon Basin, Veneto, Japanese, English 
Cape Verde: contact with Cape Verdean Crioulo 
Guinea-Bissau: contact with Guinea-Bissau Kriyol, Balanta, and other 
regional African languages 
Angola: contact with Kimbundu, Ovimbundu, Kikongo, other regional 
languages 
São Tomé and Príncipe: contact with São Tomense Creole, Principense 
Creole, Angolar Creole 
Mozambique: contact with regional languages, including Macua, Sena, 
Shona, Tsonga, and many others. 


b. Portuguese as a language of border contact or historical colonization: 
East Timor: contact with Tetum 
Goa: contact with Konkani, Marathi 
Macau: contact with Cantonese, also Mandarin and Hokkien 
Uruguay, Argentina, Paraguay, Bolivia, some areas of Peru, Colombia, 
Venezuela: contact with Spanish 


c. Portuguese as a language of immigrants from non-neighboring countries: 
United States: contact with English; also a large Cape Verdean colony 
Canada, Australia: contact with English 
Equatorial Guinea: Portuguese (and the respective creoles from São 
Tomé and Cape Verde) in contact with Spanish, Pidgin English, and 
indigenous languages 
France: contact with French; also a large Cape Verdean colony 


By far the most extensive contacts involving Portuguese spoken as a native language 
involve Spanish, in regions along the Brazilian and Portuguese borders. Within 
Spain no Portuguese is spoken along the border with Portugal, although a few 
residual isolates of dialectal Portuguese are found in Extremadura.⁵ Within 
Portugal, Spanish is used spontaneously in some border villages, although historical 
tensions between Spain and Portugal have precluded the formation of stable 
contact varieties (Boller 1995; Lang 1977; 1982). The situation is quite different along 
the Brazilian border, where Portuguese-influenced hybrid varieties have arisen 
within Uruguay, and to a lesser extent Bolivia and Paraguay. Within Brazil, Spanish 
has made no inroads, due largely to the relative economic hegemony of Brazil 
over its Spanish-speaking neighbors in more remote border areas. As a consequence, 
Portuguese in contact with Spanish is best studied just outside the Brazilian bor- 
ders, in communities where a combination of geographic, historical, political, and 
commercial forces have produced a linguistic symbiosis, affecting all components 
of the language. For purposes of illustration, two rather different 
Portuguese–Spanish contact scenarios along the Brazilian border will be examined here. 


6.1 Portuguese at the edge: Spanish–Portuguese fusion 
in northern Uruguay 


The only known stable varieties of Portuguese in contact with Spanish are found 
in northern Uruguay. The term used by linguists who have studied these vari- 
eties since the 1960s is fronterizo ‘border’, although Elizaincin (1992) prefers the 
more accurate dialectos portugueses del Uruguay ‘Uruguayan Portuguese dialects’, 
since Portuguese is clearly the base language. This series of dialects is widely called 
portuñol by the speakers themselves. These hybrid dialects are not confined to the 
immediate border communities, but penetrate deep into Uruguay, although in the 
last generation Uruguayan Spanish of the Montevideo variety is rapidly displac- 
ing the traditional fronterizo dialects. The reasons for the heavy incursions of 
Portuguese lexical, phonological and syntactic items into Uruguayan speech are 
many,⁶ and include the fact that for many Uruguayans in this region, better schooling 
and economic opportunities were traditionally to be found in Brazil. In the 
past, this region was disputed between the newly independent nations of Brazil 
and Uruguay, and was settled by Brazilians for a considerable time. Even during 
colonial times, Portuguese presence in what is now northern Uruguay was always 
significant. The reasons for the formation of a fronterizo dialect, rather than simple 
bilingualism with code-switching and a light overlay of borrowings (as found, 
e.g., in the southwestern United States) are also rooted in a complex set of sociohistorical 
facts, in which the rural residents of an isolated and marginalized zone 
were pulled linguistically in two directions, but where neither pull was strong 
enough for local speech to coalesce completely around a single base language. 

The fronterizo or Uruguayan Portuguese varieties are characterized by considerable 
morphosyntactic and lexical variation, since they are nonprestige oral 
varieties increasingly under pressure from standardized Uruguayan Spanish and 
(both through the media and the recent opening of some bilingual programs in 
Uruguayan border cities) from standardized Brazilian Portuguese. The variability 
is most noticeable in the choice of lexical items, and also in the juxtaposition of 
Spanish and Portuguese morphosyntactic configurations, but there are also a 
number of common denominators that justify the classification of fronterizo vari- 
eties as cohesive contact-induced dialects. Among the most salient features are 
the following: 


1 Fronterizo phonology combines aspects of the regional varieties of Spanish and 
Portuguese in subtle fashions, being neither the union nor the intersection 
of the two phonological systems, but rather something in between. Thus for 
example the Portuguese distinction between /s/ and /z/ is normally present 
word-internally in items such as casa ‘house’ [kaza] (Spanish [kasa]) but is 
frequently neutralized in word-final prevocalic position even when Portuguese 
functional and lexical items are used (e.g. os amigos ‘the friends’) whereas 
Portuguese requires [z] and Spanish [s] in this environment. Portuguese nasal 
vowels are often realized with a following nasal consonant, although retaining 
nasality on the vowel, e.g. Ptg. tem ‘have, exist’ [tẽj] > [tẽn]. Portuguese 
nasal diphthongs such as -ão [ãw̃] also receive a final nasal consonant [awn]. 
In fronterizo the dental consonants /t/ and /d/ have traditionally not palatalized 
before [i]; such palatalization is found in most prestigious Brazilian Portuguese 
dialects, but not in the neighboring border dialects. However the palatalized pronunciation is 
growing in frequency in fronterizo, apparently through imitation of Brazilian 
television programming (Carvalho 2004a). Fronterizo dialects normally distin- 
guish /b/ and /v/, like Portuguese and unlike Spanish. 

2 Mixing of Spanish and Portuguese articles is found in fronterizo, especially in 
view of the minimal differences between Spanish los, la and las and Portuguese 
os, a and as, respectively. Sometimes this results in combining a Spanish 
word with a Portuguese article or vice versa; on other occasions, both a 
Spanish and a Portuguese article may appear in a single sentence (Elizaincin 
et al. 1987: 41): 


(16) u [=o] material que se utiliza en el taier ‘the materials that are used in 
the shop’ 
tudus lus [= todos los] dias ‘every day’ 
la importasão de automóviles ‘the importation of autos’ 


3 Vernacular Brazilian Portuguese partially suspends plural marking in noun 
phrases, usually marking only the first element, particularly if it is an article. 
This trait is nearly categorical in fronterizo, even when Spanish articles 
are involved, and can even be found in vernacular Spanish of the border 
region (Carvalho 2006a; Lipski 2006). Some examples are (Elizaincin et al. 1987: 
41-2): 


(17) Aparte tengo unas hermanas, unos tio ‘Besides, I have sisters, aunts and 
uncles’ 
Tein umas vaca para tira leite ‘I have some cows for milk’ 
Sai cum trinta y sei gol ‘I scored 36 goals’ 


4 Spanish and Portuguese verb conjugations are nearly identical, once 
allowances for pronunciation are made, and fronterizo speakers freely draw 
on verbs from both languages. Vernacular Brazilian Portuguese frequently 
neutralizes all verb endings except for the first person singular in favor of the 
third person singular (e.g. nós trabalha [trabalhamos] ‘we work,’ éles trabalha[m] 
‘they work’), something which does not occur in any (monolingual) variety 
of Spanish. Among fronterizo speakers, combinations like nós tinha ‘we had’ 
[standard Ptg. nós tínhamos] instead of nosotros teníamos may be heard (Rona 
1969: 12; Elizaincin et al. 1987). Significantly, there are no instances of this 
gravitation toward the third person singular as quasi-invariable verb stem in 
fronterizo verbs produced in Spanish. At the same time some fronterizo speakers 
occasionally employ the third person singular instead of the first person sin- 
gular, something that does not occur in any non-creole variety of Portuguese: 
entonci yo no tein [tenho] ese dinheiro ‘then I don’t have that money’. As with 
other neutralizations of verb person and number endings, this only occurs with 
Portuguese verbs. Fronterizo speakers have also created an innovative first per- 
son plural verb form for first conjugation verbs ending in -ar; instead of the 
normal -amos (often pronounced as -amo in vernacular Brazilian Portuguese), 
fronterizo speakers consistently employ the ending -emo, normally the subjunctive 
ending for Portuguese first conjugation verbs: falemo ‘we speak’, trabalhemo ‘we 
work’, moremo ‘we live’. 


6.2 Spanish–Portuguese contacts along the 
Bolivian-Brazilian border 


The northern Uruguayan fronterizo dialects are the only stable Spanish–Portuguese 
hybrid varieties in South America, but at other points along the Brazilian border 
Spanish and Portuguese interact under varying conditions of bilingualism. 
One scenario is Cobija, in northwestern Bolivia, on the Acre River which forms 
the border with the Brazilian state of the same name. Cobija (population of 
around 22,000 in the 2001 census) and its sister city Brasiléia (population around 
16,000) are linked by bridges which carry both vehicles and pedestrians. The 
border is open; there are no tolls and no documentation need be presented on 
either side of the bridges. Nowadays the main economic force in Cobija is trade 
with neighboring Brazil; Cobija has a large duty-free shopping area near the main 
international bridge, and every day hundreds of Brazilians flock to downtown 
Cobija to buy a wide range of imported and national products, all of which can 
be purchased at favorable prices due to the relative strength of the Brazilian real 
with respect to the boliviano as well as the absence of tariffs and duties. The socio- 
linguistic history of Cobija is not unlike that of northern Uruguay. Both regions 
were long ignored by distant central governments. In both regions the economy 
of neighboring Brazilian towns was more highly developed, with better schools, 
hospitals and clinics, and better transportation. Until the arrival of cable televi- 
sion and the building of local radio stations, the only radio and television stations 
available in northern Uruguay and northern Bolivia were Brazilian. In more recent 
times, the establishment of tax-free commercial zones and the relative strength of 
the Brazilian economy compared to neighboring countries have attracted many 
Brazilians to border cities in Uruguay and Bolivia. 

Many Portuguese words and expressions are used in the Spanish of Cobija 
(Saavedra Pérez 2002: 143-53), and young people frequently greet each other 
with hybrid expressions like ¿qué tu ta fassendo aquí? ‘what are you doing here?’ 
and tu é muito bonita ‘you are very pretty’. These combinations reflect the use of the 
second person singular subject pronoun tu in the regional Brazilian dialect of Acre, 
compared with the use of vos and the corresponding verb forms in northern Bolivian 
Spanish.⁷ Nearly everyone in Cobija says bora instead of vamos ‘let’s go’, from 
Portuguese vamos embora: bora tal lugar ‘let’s go to that place’. Parents are referred 
to by the Portuguese words pai and mai, even in families where only Spanish is 
spoken. As in northern Uruguay, residents of Cobija often use ta to indicate approval 
and ¿todo bien? as a greeting. When speaking Spanish, some residents of Cobija 
use double negation, reflecting vernacular Brazilian Portuguese: aquí no hay no 
‘here there is nothing’, no sé no ‘I don’t know’. There are occasional non-inverted 
questions, also reflecting Brazilian Portuguese syntax: ¿dónde vo(s) viví(s)? ‘where 
do you live?’, a pattern also seen in hybrid sentences such as the aforementioned ¿qué tu 
ta fassendo aquí? In situ questions, frequent in colloquial Brazilian Portuguese, 
sometimes occur in the Spanish of Cobija: ¿Vo(s) viví(s) dónde? 

Given the daily presence of Brazilians in Cobija and the fact that most children in 
Cobija prefer Brazilian television programs (and many Bolivian adults watch 
Brazilian soap operas), most cobijeños can speak at least some Portuguese. Some 
residents of Cobija speak Portuguese nearly flawlessly, particularly those married 
to Brazilians or who have lived extensively in Brazil. More common is the use of 
Spanish phonotactics and morphosyntax when attempting to speak Portuguese. 

In the past decade and a half, the founding of the Universidad Amazónica del 
Pando in Cobija has attracted hundreds of Brazilian students, particularly in the 
fields of computer science and agro-forestry. Some Brazilian students marry 
Bolivians and establish bilingual households in Cobija. All Brazilian students are 
required to take intensive courses in Spanish in order to survive in the Bolivian 
classroom environment. As occurs in other language contact environments 
between the two closely related languages, Brazilians in Cobija rarely master the 
Spanish language, but rather speak a range of second language approximations 
and spontaneous hybrid idiolects that many residents of Cobija regard as portuñol. 
Grammatically, Brazilians’ attempts at speaking Spanish are characterized by the 
same interweaving of Spanish and Portuguese elements as found among Cobija 
Spanish speakers. 

A quite different sociolinguistic configuration obtains in the other major Bolivian 
city on the Brazilian border, Guayaramerin, in the department of Beni (Crespo 
Avaroma 2006). Guayaramerin (population around 41,000 in the 2001 census) is 
separated from its Brazilian counterpart Guajará-Mirim (population around 
38,000) by the wide and often turbulent Mamoré river, a river so wide that from the 
river bank the opposite city can barely be made out. The towns are served 
by a regular motor ferry, a journey that takes around 20 minutes. The 
presence of a duty-free shopping zone in Guayaramerin and a favorable 
exchange rate result in the Bolivian city being filled with hundreds of Brazilian 
tourists every day, in the shopping area that stretches along the main avenue 
from the port terminal for some ten blocks. Relatively few Bolivians travel on a 
regular basis to the neighboring Brazilian city; the cost of transport, the higher 
prices, the reluctance to ride in small boats, and the need to present a yellow 
fever vaccination certificate upon entering Brazil account for the asymmetrical 
patterns of tourism. All Bolivians engaged in commerce with Brazilian tourists 
in Guayaramerin speak some Portuguese, with the same second language traits 
found in Cobija. 


6.3 Portuguese-Spanish language mixing as congruent 
lexicalization 


The sociolinguistic situations are quite distinct in Cobija, Guayaramerin, and northern 
Uruguay; in the first two cities Spanish is the principal language, there is almost 
no Spanish–Portuguese code-switching, and when residents attempt to speak 
Portuguese they exhibit variable and idiosyncratic patterns of first language 
interference in accordance with their individual level of competence in Portuguese. 
In northern Uruguay, the fronterizo dialects are spoken natively, and the mixture 
of items derived from Spanish and Portuguese is quite consistent. Despite these 
differences, the superficial patterns of language mixing in the three speech communities 
are quite similar, as shown in the following examples (Spanish words 
are in regular typeface, Portuguese words are in italics, cognate homophones, 
allowing for differences in spelling and low-level phonetic differences, are in 
bold, and neologisms combining both Spanish and Portuguese elements are 
underlined): 


(18) a. Cobija: Bolivians’ attempts to speak in Portuguese: 
você não ta entendendu lo que quiere decir 
‘you don’t understand what that means’ 


eu acho que voy, mas primero tenho que... 
‘I think that I’m going, but first I have to...’ 


b. Cobija: resident Brazilians’ attempts to speak in Spanish: 
tamen tive, una relación con Paraguay, entonce volví aquí a Cobija toy 
vivindo cuatro mese 
‘I also had a relationship with Paraguay, then I came back here to Cobija, 
I’ve been living here for four months’ 


doh mil doh tamén empecé, tivi qui viajar, doh mil treh tamén entrei 
informatica 
‘I began in 2002, I had to travel, then in 2003 I entered the program in 
computer science’ 


c. Guayaramerin: Bolivians’ attempts to speak in Portuguese: 
porque não tem, como le puedo falar, vitrina 
‘because there isn’t any, how can I explain to you, display case’ 


mas algunoh brasileiro entendem lo que hablamoh nosotro loh boliviano 
‘but some Brazilians understand the way we Bolivians speak’ 


vejo las novelas, o jornal 
‘I watch soap operas and the news’ 


d. Rivera, Uruguay: fronterizo/portunhol speech: 
donde fica tal cosa? 
‘Where is that thing?’ 


voy passar pa [x]ubilação 
‘I’m going to take retirement’ 


entonci yo no tein ese dinheiro 
‘then I don’t have that money’ 


This mixed language is not the result of code-switching but rather of involuntary 
mixing of the target language and the native language during attempts to speak 
entirely in the target language. The Uruguayan fronterizo dialect is not currently 
a participant in a code-switching environment, but historically it probably 
derives from a sociolinguistic environment similar to the one that characterizes 
northern Bolivia. In addition to having a high density of Spanish–Portuguese 
juxtaposition, the aforementioned Spanish–Portuguese hybrid combinations 
appear to violate well-documented syntactic constraints on intrasentential code-switching. 
This is true of the speech along the Bolivian–Brazilian border, which represents attempts 
to speak entirely in Spanish or Portuguese, and also of the stabilized and natively 
spoken Uruguayan fronterizo. Some typical examples are: 


(19) a. Between a pronominal subject and predicate: 
Cobija: 
sei la yo ‘I don’t know’ 
Cobija, Brazilians attempting to speak Spanish: 
ela decia ‘nostra’ ‘she would say “nostra”’ 
yo tamben tive ehpañol allá ‘I also had Spanish there’ 
que yo saiba parece que vai ser por su cuenta 
‘as far as I know it seems like it will be on (their) own’ 
Guayaramerin: 
ellos ya misturam ‘they mix’ 


b. Between negative words and main verb: 
Cobija: 
¿mas vai o no vai? ‘Are you going or not?’ 


c. Between fronted interrogative words and the remainder of the sentence: 
Cobija, Brazilians’ Spanish: 
quien quere ter mah conocimiento ‘whoever wants to know more’ 


d. Between auxiliary verb and infinitive: 
Guayaramerin: 
entonces ellos aprendieron que no hay que trocar a moeda 
‘then they learned that it is not necessary to change money’ 


porque não tem, como le puedo falar, vitrina 
‘because there isn’t any, how can I explain to you, display case’ 


In Cobija and Guayaramerin these combinations are not part of a stable mono- 
lingual grammar nor do they result from code-switching in fluent balanced bilin- 
guals; they are rather the idiosyncratic approximations to a partially acquired 
cognate language produced by speakers with only limited bilingual competence. 
Many of the same morphosyntactic juxtapositions also occur in fronterizo, which 
is spoken natively and with consistent grammatical and lexical patterns; the 
following examples obtained in Rivera, Uruguay illustrate the superficial similarities 
between the second language Portuguese or Spanish produced along 
the Bolivian–Brazilian border and the first language hybrid speech of northern 
Uruguay: 


(20) a. Between pronominal subject and predicate: 
entonci yo no tein ese dinheiro ‘then I don’t have 
that money’ 
[ʒ]o no vou me aposentar ‘I’m not going to retire’ 


b. Between negative word and main verb: 
yo no tein ese dinheiro ‘I don’t have that money’ 


c. Between fronted interrogative word and the remainder of the sentence: 
¿Dónde fica tal cosa? ‘where is that thing?’ 


d. Between auxiliary verb and infinitive: 
Y se dificulta mais aprender o espanhol ou o portugués. 
‘And it’s harder to learn Spanish or Portuguese.’ 
Na escola é donde éles decidem agarrar español. 
‘In school it’s where they decide to take Spanish.’ 


The Spanish–Portuguese language interleaving just described fits the basic profile 
of congruent lexicalization, as defined by Muysken (2000), despite the fact that none 
of the three cases involves fluent bilingualism, interaction with bilingual interlocutors, 
or any conscious decision to use more than one language or dialect in a 
conversation. All three cases fully conform to the notion of words “inserted more 
or less randomly” (Muysken 2000: 8). In fact the “more or less random” nature of 
language mixing is much more apparent in the Spanish–Portuguese cases examined 
here than in any of the instances of fluent bilingual language mixing adduced by 
researchers who have adopted congruent lexicalization as a category of language 
switching (e.g. Deuchar, Muysken, & Wang 2007). The apparent randomness of the 
language mixture in Bolivia is due not only to the high degree of shared structures 
between the two languages but also to the limited proficiency in the second language, 
which results in “filling in the gaps” by means of words from the speakers’ first 
language. Indeed, it is quite likely that Uruguayan fronterizo originally arose in 
precisely the same fashion as in contemporary northern Bolivia, i.e. when speak- 
ers of Spanish attempted to speak Portuguese without having fully acquired 
the language. The linguistic history of northern Uruguay is compatible with this 
scenario. Until 1862 the northern region of what is now Uruguay was a disputed 
territory populated entirely by Brazilian squatters. Beginning in 1862 the 
Uruguayan government began a deliberate settlement effort, sending internal 
colonists from the populated south in order to establish de facto occupancy of the 
northern border. Only Portuguese was spoken in this region until well into the 
second half of the nineteenth century (Elizaincin 1992: 99-100; Carvalho 2006b; 
Behares 1984a; 1984b). Spanish-speaking Uruguayans arriving from the south would 
have been faced with a disadvantageous situation, both linguistically and socio- 
economically. The official national language, Spanish, was a recently injected 
minority language in northern Uruguay, numerically and sociolinguistically 
dominated by Portuguese. At the same time the resident Portuguese speakers were 
ethnically Brazilian and regardless of declared citizenship (a barely meaningful 
term in the mid nineteenth century), they identified with the neighboring giant 
nation. Once villages and towns were settled in northern Uruguay, the cultural 
and economic domination of Brazil was even more evident; schools, newspapers, 
medical facilities, and even consumer goods were available principally in Brazil, 
and the Portuguese language dominated northern Uruguay. Arriving Spanish 
speakers would have had to learn Portuguese, perhaps reluctantly, and at least 
during the first generations, without opportunities to formally study the language. 
Northern Uruguayans were far removed from Spanish-language media, at first 
newspapers, later radio and television. Passive competence in Portuguese would 
be high from the outset, but the Spanish language always trickled in, and was 
never entirely displaced by Portuguese. The emerging fronterizo dialect would have 
coalesced as Uruguayan Spanish speakers attempted to speak Portuguese, com- 
bined with local Portuguese-speaking residents’ gradual acquisition of Spanish. 
Both groups would fall back on the cognate lexical items and syntactic structures 
that characterize learners’ portuñol every time the two languages come into contact. 
Perhaps becoming aware of their own emerging ethnolinguistic identity, succes- 
sive generations of northern Uruguayans evidently stopped short of fully acquir- 
ing Portuguese (or Spanish), even in environments where this may have been 
possible. The end result is a speech form which is cohesive and consistent 
enough to have given rise to a folk literature and music, and to have 
provoked highly ambivalent feelings among speakers and observers alike. 


7 Conclusions 


The preceding sections, while representative of the range and scope of language 
contacts involving Spanish and Portuguese, cover only a tiny fraction of the myriad 
environments in which these two languages share the linguistic space with other 
tongues and peoples. In view of the widely varying sociohistorical, demographic, 
and linguistic circumstances, there are few common denominators, other than those 
that define all language contact environments. With over half a billion native or 
near-native speakers worldwide, Spanish and Portuguese are prime exemplars of 
the multitude of potential outcomes from language encounters. These encounters 
have been instrumental in shaping the spread and diversification of Spanish and 
Portuguese in times past, and they continue to play an important role in con- 
temporary speech communities. 


NOTES 


1 Unlike in the Andean dialects and in Mexico and Central America, in Paraguayan Spanish 
clitic doubling is never found with inanimate direct objects. In both Ecuador and Paraguay, 
the default masculine direct object clitic is le (usually an indirect object clitic in other 
Spanish dialects) rather than lo. 

2 In Puno, Peru, for example, the sociodemographics strongly favor the possibility of 
indigenous linguistic transfer to Spanish, since as much as 90 percent of the popula- 
tion speaks Quechua or Aymara. Benavente (1988) found that among university students 
in Puno, acceptance of clitic-doubled constructions occurred at levels of 70-80 percent 
and even higher, including non-agreeing lo as in ¿quién loᵢ tiene la llaveᵢ? ‘who has the 
key?’, impossible in other Spanish dialects. Moreover, bilingual speakers accepted 
these combinations more readily than did monolingual Spanish speakers. Godenzzi (1988), 
also studying the Spanish of Puno, obtained comparable results. Clitic doubling was 
preferred among the lowest socioeconomic sectors (in which indigenous speakers are 
overrepresented). 

3 In the southern Andean region of Peru, and in northwestern Argentina, the invariant 
clitic lo can be used with intransitive motion verbs and occasionally other unergative 
verbs (Granda 1993; Godenzzi 1986): ya lo llegó ‘[he] arrived’; ya lo entró ‘[he] entered,’ 
ya lo murió ‘[he] died’. Cerrón-Palomino (1976) has suggested that this pleonastic lo is 
a direct calque of the Quechua verbal suffix -rqu, which connotes outward motion. 

4 In Quechua, the case marker -ta has other functions, including adverbial and locative 
uses. It is also used to signal direct objects in certain double-object constructions 
involving verbs of helping and teaching. In nearly all instances, however, -ta does not 
appear in immediate preverbal position, nor in any other single canonical position that 
might cause -ta to be calqued by an object clitic in Andean Spanish. Postnominal -ta 
may also be followed by other enclitic particles in non-dative constructions, in effect 
being “buried” among the clitics and not corresponding in any clear way with a 
Spanish element. Only in the case of accusative -ta is the linear order convergent enough 
with Spanish CLITIC + VERB combinations to make transfer feasible. 

5 www.galeon.com/hablasdeextremadura. 

6 Traditional studies of the fronterizo dialects include Academia Nacional de Letras 
(1982), Elizaincin (1973; 1976; 1979; 1992; Elizaincin, Behares, & Barrios 1987), Hensey 
(1972; 1975; 1982a; 1982b), Rona (1960; 1969). Carvalho (2003a; 2003b; 2004a; 2004b) provides 
contemporary variational and sociolinguistic analyses of this complex language 
contact environment. 

7 Although the regional Portuguese dialect of Acre uses the pronoun tu, the verb forms 
correspond to the third person pronoun você used in most other Brazilian dialects: 
tu foi, tu trabalha, etc. 


28 Contact and the 
Development of the 
Slavic Languages 


LENORE A. GRENOBLE 


1 The Slavic Languages and Contact 


The speakers of Slavic languages have expanded over a considerable amount of 
territory, stretching from Sorbian communities in Lusatia (in the German states 
of Saxony and Brandenburg) in the West to the far eastern border of Russia on 
the Pacific Ocean in the East, from the Russian borders of the Arctic Ocean in the 
North to the Macedonian borders with Greece in the South, thus spanning from 
the heart of Western Europe all across Asia. The Slavic languages are generally 
classified in terms of three branches which capture not only their genetic 
classification but also, roughly, their geographic distribution: East, West, and South. 
The modern East Slavic languages are Belarusian, Russian, and Ukrainian; the West 
are Czech, Slovak, Polish, Kashubian (classified by some as a dialect of Polish), 
Lower Sorbian, Upper Sorbian (also called Wendish or Lusatian); and the South 
Slavic languages are Bulgarian, Macedonian, Bosnian-Croatian-Serbian (or BCS), 
and Slovenian.¹ To this group we can add Rusyn,² an East Slavic variety with strong 
influence from Slovak whose speakers live in Ukraine, Slovakia, Serbia, and 
Croatia. Following other criteria, however, it is possible to classify Slavic into North 
and South groups, a classification which places the East and West Slavic languages 
together in distinction to the South Slavic languages. Many of the criteria used 
for the North-South grouping stem from contact-related phenomena. 

In addition to these living languages is Old Church Slavonic, which is now extinct. 
It is a liturgical language created in the late ninth century based on the South Slavic 
variety spoken in the area of Moravia where the Slavs were first Christianized. 
Because of its importance in the spread of Christianity and literacy, it has played 
a significant role in the development of a number of Slavic languages, and has 
served as a vehicle for introducing borrowings from non-Slavic languages, espe- 
cially Greek. Polabian and Slovincian, both West Slavic, are also now extinct. 

The range and extent of contact-induced phenomena vary according to time and 
language and are often difficult to assess. By and large cases of lexical borrow- 
ing are relatively clear, in terms of what is the source and what is the target. But 
in other areas of potential contact-induced change, e.g. phonological, morphological, 
or syntactic phenomena, it can be impossible to prove without question that a 
given phenomenon or feature is the result of contact and not independent innovation 
or shared inheritance. This is perhaps particularly true for the impact of 
one Slavic variety upon another, where the genetic and typological properties 
of both are extremely close to one another. Additional ambiguities are introduced 
by the fact that some important contact phenomena occurred during the prehis- 
toric period. 


2 Prehistoric Contact 


The Slavic prehistoric era can be divided into two phases based on a series of 
sound changes: the early period, from approximately 3000 to 1000 BCE; and the 
more recent period, roughly from the 300s to the 900s CE. During this more recent 
time we find changes common to all Slavic dialects as well as changes which 
indicate the break up of Slavic around 900 CE, when we have the first written 
documentation of Slavic. Historically this period is flanked by the expansion of 
the Goths into the Black Sea region at its beginning and by the Christianization 
of the Slavs and the development of a Slavic written language at its end (Andersen 2003: 
46). The region also served as a corridor for the migration of a variety of tribes 
(Turkic Huns, Bulgars, Pechenegs, and the Altaic Magyars) into Europe; there are 
few traces of lexical borrowings from these groups into Slavic but they may have 
had an impact on Slavic phonology (see Galton 1994). There are two sources of 
loanwords in prehistoric times which deserve special discussion — Germanic and 
Iranian (especially Scythian). 


2.1 Iranian 


Iranian contact, in particular with Scythian and Sarmatian tribes, appears to 
have taken place in the area of what is now southern Russia from approximately 
700 BCE to approximately 300 CE, although some apparent loanwords may actu- 
ally be cognates. The correspondences between Early Common Slavic and Iranian 
phonology make it hard to distinguish whether some similar items are cognate 
or borrowings (Andersen 2003; Zaliznjak 1963). A large percentage of these 
borrowings can be organized into four semantic categories: religion (e.g. bog ‘god’, 
div ‘demon’, gatati ‘to divine’, rajv ‘paradise’, svetv ‘holy’); law, society, and morality, 
broadly defined (kajati se ‘to repent’, zvl’v ‘bad, evil’, cust ‘honor’); and health 
and body parts (*svdorvv ‘healthy’, *porsi ‘breast’, *goldv ‘hungry’). Such broad 
groupings constitute approximately 45 percent of all Iranian borrowings; the remain- 
ing include such words as radi ‘for the purpose of’, sobaka ‘dog’, and xvala ‘glory’. 
Many Iranian loanwords are not found in all Slavic varieties but are limited to 
certain areas, despite the fact that they appeared to have been borrowed early 
on (Trubačev 1977). A singular example is the word vurdoljakv ‘werewolf’, which 
was borrowed into most or all Slavic languages but has undergone change to 
varying degrees in the different Slavic dialects (Nichols 1993: 387). 
2.2 Germanic 


There are a large number of Germanic loanwords in the prehistoric period which 
can be seen as coming in separate waves. The absolute total number of these is 
unclear but in the range of 150, although there are only about 50 “secure” examples 
(Andersen 2003). From the standpoint of Balto-Slavic contacts, Germanic loanwords 
can be divided into three groups: those found only in Baltic, those found only in 
Slavic, and those found in both. The first Germanic-Slavic contact affected only 
some Proto-Slavic tribes dwelling in the sub-Carpathian zone, while later contacts 
were more extensive and lasted longer. Contacts with the Goths affected practic- 
ally all the prehistorical Slavic tribes. In many instances it is difficult to pinpoint 
the date and precise source of a borrowing due to the duration and prehistoric 
nature of these contacts. There are numerous borrowings in technical termino- 
logy from Germanic (as well as Italic and Celtic) into Slavic but not Baltic. It should 
first be noted that the existence of loanwords (or cognates) in Baltic and Slavic is 
often invoked in favor of a shared heritage or, alternatively, in favor of extensive 
contact. 

The earliest evidence of Germanic contact comes from an inscription and 
appears to date to the first centuries AD. The loanwords found on the inscription 
cannot be attributed to any specific Germanic group. The second layer corresponds 
to contact between the Goths and Slavs when the former settled in the region north 
of the Black Sea, in approximately 200-300 AD. With the westward expansion of 
the Slavs, from the 400s on we find West Germanic loanwords, primarily from 
High German. Representative early Germanic borrowings are duma ‘thought’, Goth 
doms ‘judgment’; gotoviti ‘to prepare’, Goth gataujan; kupiti ‘to buy’, Goth kaupon; 
or šelm ‘helmet’, PGmc *helmaz. Later loanwords are not found in all Slavic dialects: 
bl'udo ‘dish’, Goth biuþs; buky ‘script’, Goth boka ‘letter’ (Schenker 1995: 159). See 
in particular Gołąb (1991) and Kiparsky (1934) for a more detailed discussion of 
loanwords and more examples. 


3 Finno-Ugric Contact and the Finno-Ugric 
Substrate in Russian 


The question of a Finno-Ugric substrate in Russian is a matter of some debate, 
but it is fairly clear that at least some features of Modern Russian are the result 
of contact with Finno-Ugric. That there have been longstanding contacts between 
the different speakers is not in doubt. The regions of central and northern Russia 
were populated by a number of Finno-Ugric tribes; speakers of these languages 
spread as far west as Finland and Estonia. The Slavs expanded into the east and 
north into Finno-Ugric territories in the fifth century and to this day a large number 
of toponyms in central and northern Russia are Finno-Ugric, as in the city 
of Vologda (with stress on the first syllable, as in Finnish), or Lake Ilmen or 
Lake Ladoga. 
As with other issues, the question of Finno-Ugric impact can be divided into 
those instances where the evidence of the substrate is clear, and where it is prob- 
able but not as clear. In both sets of cases, there is a difference between standardized 
Russian (CSR) and the dialects. (In northern dialects where variation has not been 
eliminated due to the influence of the standard, the number of clear influences is 
even greater.) Veenker (1967) cites three clear cases of Finno-Ugric contact in CSR: 


1 akan’e, or the so-called “reduction” of unstressed vowels, as in moloko ‘milk’ 
[malako]; the spelling retains the etymologically correct vowel /o/ in all three 
syllables; 

2 the so-called nominal sentence, or the loss of the copula in the present tense, 
as in ja čelovek ‘I am a man’, lit. ‘I – man’, a pattern found throughout all Finno-Ugric 
languages except those in the Baltic subgroup; 

3 loss of the verb ‘to have’ (imet’) except in scientific prose and a few set 
phrases. In this regard Russian is distinct from all other Slavic languages, 
where it is regularly used. The Russian normal verbal construction for pos- 
session is one with the possessed as the nominative subject and the posses- 
sor in a prepositional phrase, with the preposition u ‘at, by’ which governs 
the genitive case. 


Finno-Ugric influence is also most probable in CSR in the use of a partitive genitive 
and the use of a locative (-u) in masculine and neuter singular paradigms 
of some nouns, which is in distinction to the prepositional case -e for these nouns 
(e.g. o lese ‘about the forest-PREP’ versus v lesu ‘in the forest-LOC’), while in most 
nouns the endings are homophonous. Probable Finno-Ugric influence in CSR is 
also found in the use of comitative constructions, such as my s vami ‘you and I’, lit. 
‘we with you-INST.PL’ or my s ženoj ‘my wife and I’, lit. ‘we with wife-INST.SG’. 

Impact of Finno-Ugric in some northern Russian dialects is seen in certain sound 
changes: (1) diphthongization of the vowels o > oa and e > ia; (2) cokan’e or the 
collapse of the palatals c and č into one phoneme; and (3) stress on the first syllable 
(dialect póšla, CSR pošlá ‘[she] went’). There are also morphological changes: 
(1) comparative of substantives (e.g. berežee ‘closer to the shore’, from the noun 
bereg ‘shore’); (2) the nominative object; and (3) certain loaned suffixes, such as 
the causative -tta, -itta from Finnish. 

Finno-Ugric influence on the northern dialects is probably also seen in the 
development of postpositive definite articles (e.g. -ot, -ta, -to, -ti, -te), which are 
arguably used under the influence of Komi-Zyrian definite-possessive suffixes 
(Tiraspol’skij 1998). Another construction frequently attributed to Finno-Ugric 
contact is use of the construction of the type u nego uexano, with the preposition 
u plus genitive of the logical subject and a past passive participle in the neuter 
singular nominative instead of a finite past tense verb form, e.g. on uexal ‘he has 
departed’. In north Russian dialects this construction was prevalent with both 
transitive and intransitive verbs. It has also been argued that this list could be 
expanded by other changes, such as use of [p] instead of [f], and the development 
of perfect and pluperfect tenses. 
Even the clear evidence of a Finno-Ugric substrate is not entirely straightforward. 
One of the more contested claims is the absence of the verb have in modern Russian. 
Its lack has alternatively been explained as contact-induced change or as a relic 
of the earlier Common Slavic state. Although Indo-European is generally believed 
to have been lacking a have verb, the modern Slavic languages (with the exception 
of Russian) have one as a subsequent development. The question is whether 
Russian also had the verb and lost it or if it represents a more conservative stage 
of Slavic, arguably as part of a core-periphery change, with Russian removed from 
the center of innovation. Evidence in favor of this latter argument is the fact that 
in OCS texts all instances of iměti ‘have’ are loan-translations from the Greek. 
Evidence in favor of the former theory, that a pan-Slavic verb iměti ‘have’ existed 
and was lost through contact with Finno-Ugric tribes, comes from the Baltic languages: 
Latvian, which had Finno-Ugric contacts, also uses a be construction for 
possession, while Lithuanian, which did not, uses a have verb (Isačenko 1974; 
Kiparsky 1969; see Dingley 1995 for a review of the arguments). 


4 Contact in the Early History of the Slavs 


We have no written records of Slavic prior to the mid ninth century, and the ear- 
liest data are only brief inscriptions. The first manuscripts which have survived 
come from the early tenth century and are almost exclusively translations from 
the Greek gospels. The language used in these manuscripts is Old Church Slavonic, 
a liturgical language created as part of a mission from Byzantium to Christianize 
the Slavs which sent two missionaries, the brothers Constantine (later known as 
Cyril) and Methodius, to Moravia in 862-3. Bilingual in Greek and the South Slavic 
variety spoken in Salonika at the time, Cyril and Methodius created a largely 
artificial, liturgical language and began their work by creating translations from 
Greek, and the oldest OCS texts are translations of the gospels. Thus Greek had 
a major impact on the structure of OCS. 

As Christianity spread among the Slavs, so too did OCS and its use in the liturgy. 
The modern Slavic languages can be divided according to the orthographic 
system they use today, Cyrillic or Roman. This division maps almost perfectly 
onto religious divisions: those languages which use the Roman alphabet are 
primarily spoken by Catholics, and those using the Cyrillic alphabet are primarily 
spoken by Orthodox Christians (with the notable exception of some Muslim groups, in 
particular in South Slavic territory). The former group initially used Latin for its liturgy 
and the latter Old Church Slavonic. This dichotomy more or less corresponds to 
the initial source of religion and thus liturgical texts for each group, although not 
perfectly. The languages using the Roman alphabet are all of the West Slavic 
languages, and Croatian and Slovenian in South Slavic territory. Those using the 
Cyrillic alphabet are the East Slavic languages, Belarusian, Russian, and Ukrainian, 
along with the South Slavic languages Bulgarian, Macedonian, and Serbian. 

Old Church Slavonic played a particular role in the development of Russian 
and had a major impact on its structure. It was less central in the development 
of Belarusian and Ukrainian, which show a heavier Polish and German influence, 
due to their respective histories. Ukraine and Belarus were taken over by Lithuania 
and Poland, beginning with the annexation of Polotsk in the tenth century and 
continuing well into the fourteenth century. 1386 marks the inception of the 
Lithuanian Union with Poland, and the time when the Lithuanian ruling class 
effectively became Polonized. As a result, the impact of OCS was weaker 
on Belarusian and Ukrainian than on Russian. 

The East Slavs received this written language in the tenth century with the 
conversion in 988-9 of Prince Vladimir in Kiev to Christianity. A century later, 
it is clear from historical records that Church Slavonic had lost its unity and was 
replaced by regional variants, but it continued to be a primarily South Slavic 
language. Moreover, as the ecclesiastical language it was seen as sacred, in terms 
of both linguistic and orthographic form. It was thus very conservative and 
major changes in Church Slavonic are only seen with deliberate attempts to rid 
it of regional or “debased” elements and bring it to its “pure,” more Hellenistic 
form (although perceptions of what constituted “pure” Church Slavonic varied 
over time). 

At this stage, the phonemic and morphosyntactic differences between OCS and 
East Slavic were not so great as to prohibit comprehension, although OCS had 
a large, specialized lexicon that was lacking in East Slavic. OCS was also characterized 
by several South Slavic sound features, such as the use of ra < *ar, la < *al, 
in distinction to the pleophonic forms of ESl (-oro, -olo), and the consonant clusters 
št’, žd’ (versus the ESl forms š’č or žž’). These differences have had a profound 
impact on Russian and can be readily found in paired lexical items, where the ESl 
term refers to the ordinary, everyday object and the Church Slavonic form to a more 
elevated or scientific term, as in R moloko ‘milk’ versus mlekopitajuščij ‘mammal’ 
or rovnyj ‘even’ versus ravnyj ‘equal’, ravenstvo ‘equation’. The biggest differences 
between the two languages were found more at the level of macrosyntax and 
involved differences in clause combining, e.g. coordination and subordination, 
the use of infinitival and participial constructions, and the dative absolute. Such 
differences are not surprising given that spoken East Slavic was not used for 
writing and that OCS was not only a liturgical language, but one built on the 
syntactic forms of Greek. 

Church Slavonic continued to dominate as the written language in first Rus’ and 
then Russia for many centuries. In fact its significance went through a revival in 
the Muscovy period in what is known as the “Second South Slavic influence.” By 
this time, spoken Russian had diverged so much from Church Slavonic that it is 
appropriate to consider it an entirely different language. The period of high Muscovy 
(in the fifteenth and sixteenth centuries) can be considered a period of diglossia, 
with Russian used as the spoken language and in all everyday domains and Church 
Slavonic as the written language used in formal, administrative, and ecclesiast- 
ical domains and increasingly inaccessible to all but the most educated elite. The 
Second South Slavic influence, an attempt to purify Church Slavonic as used in 
Muscovy, brought with it a highly artificial literary style based on the elaborate 
and equally artificial Greek style of the time. Etymologically correct Church 
Slavonicisms replaced East Slavicisms which had crept into the language, including 
forms in -žd- such as prežde ‘formerly’, which replaced pereže and preže, or pobeždën 
‘conquered’ for pobežen; likewise, the alternation between vremja ‘time’ and veremja was 
lost, with only the former, Slavonic form used. These changes remain in Russian 
to this day. 

Church Slavonic was not the only language to have an impact on Russian in 
the historical period. Poland, and thus the Polish language, became the primary 
link between Western Europe and Russia, and Polish served as the vehicle for 
introducing a great many lexical borrowings from Latin, French, and Greek at 
this time. French had a direct and profound impact on Russian in the eighteenth 
century, when both written and spoken forms of French provided models for the 
Russian aristocracy. The overall impact of French on Russian is hard to assess, as 
it extends beyond matters of the lexicon and goes into the question of style. 


5 Western European Languages and Slavic 


Several Western European languages have had an impact on the different Slavic 
languages. At varying times throughout history, Latin and French, and Italian to 
a lesser extent, have primarily affected the lexicon of individual languages. 
Germanic contact in prehistoric times was significant, and a number of modern 
Slavic languages are notable for the imprint that sustained contact with German 
has left on their structures: Czech, Upper Sorbian, and Polish. In addition, 
German influence is clear on Polabian (now extinct) and on some dialects of 
Croatian. 


5.1 Czech and German 


Czech—German contact can be established as early as the late ninth century, when 
both Latin and Greek liturgical terminology was borrowed into Czech through 
German. In the twelfth century, German became the court language in the 
Bohemian state and during the thirteenth and fourteenth centuries was the 
dominant language in urban areas due to the immigration of German merchants 
and others. A number of calques and borrowings from both Latin (karta ‘map’; 
figura ‘figure’) and German (hrabě ‘count’; říše ‘realm, empire’) date to this time. 
In the eighteenth century, German had high prestige in Czech society, playing a 
significant and dominant role in public life, the schools, and administration. From 
the end of the eighteenth century, the prestige of German only increased as it was 
used in administration and education in the Hapsburg Empire. Up until 1918 and 
the formation of the new Czechoslovak state, German served as the written, 
and often the spoken, language in much of Czech-inhabited territory, especially 
among urban, educated Czechs and many Slovaks (Townsend 1981). 

One of the most frequently cited effects of Czech-German contact is the diphthongization 
in Old Czech of the long vowels (ú > ou, ý > ej) and monophthongization 
(ie > í, uo > ů) of diphthongs on the one hand, and similar changes in Middle High 
German on the other (ū > au, ī > ei; ie > ī, uo > ū). Yet a causal relationship is not 
definitively proven; these may well be parallel developments. There are, however, 
a large number of loanwords, calques, and phraseological translations which have 
clearly entered Czech from German. Some are older borrowings and are found 
in all or nearly all Slavic languages, while others are exclusive to Czech or Slovak 
(e.g. Cz ramsl, Sk ramsl’a < G Ramschel ‘a card game’). It has also been claimed 
that the frequent use of Cz žádný ‘no, none’ is due to influence from G kein (Schuster-Šewc 
1996: 16). German has had some impact on the interpretation of aspect in 
calques, as seen in Cz vyznat se v něčem, a calque of the G sich in etwas auskennen 
‘to be familiar with something; to know one’s way around something’ (literally, 
‘to know oneself out in something’). In Czech, the verb vyznat is morphologically 
perfective, but is used as an imperfective under the influence of the German 
model (Townsend 1981: 7). 

By the late eighteenth and early nineteenth centuries, Czech had been largely 
eclipsed by German and, in an attempt to revitalize the language, a policy of 
re-Slavicizing the lexicon was adopted, a policy largely associated with Josef 
Jungmann’s work. As a result, many borrowings from German (as well as from 
Greek and Latin) were replaced by words from Czech dialects or new lexical items 
based on Slavic roots, such as Cz hudba ‘music’ (versus R, P muzyka) or Cz knihovna 
‘library’ (R, P biblioteka), while a number of calques from German using Czech 
roots have survived (zeměpis ‘geography’; krasopis ‘calligraphy’). 


5.2 Upper and Lower Sorbian 


Two modern Slavic languages, Upper Sorbian and Lower Sorbian, are surrounded 
by German-speaking territory and lack an autonomous region of their own. 
Upper Sorbs live mostly in southern Lusatia in Saxony and Lower Sorbs in the 
Niederlausitz region; they have long been an ethnic minority in Germany. 
Sorbian monolingualism ceased before World War II; nearly all Upper Sorbian 
speakers are fully bilingual in German and use primarily German; Lower Sorbian 
is seriously endangered. Both languages show significant impact of contact with 
German, change which has been almost entirely unidirectional (Toops 2006). 

A number of calques from German occur in Sorbian and Czech, as in the literal 
translation of G gern haben ‘to like’ (lit. ‘to have gladly’) as Cz mít rád; Upper 
Sorbian rady měć. Changes in the use of the instrumental case can also be ascribed 
to German influence. The pan-Slavic pattern is to make a distinction between the 
instrumental or absolute use of the instrumental case, which does not occur with 
a preposition, and the instrumental of accompaniment, which is used with the 
preposition ‘with’. In Upper Sorbian these two usages are collapsed and are found 
only with the preposition, as seen in the following examples (Lötzsch 1996: 56), 
which both use the preposition z ‘with’: Upper Sorbian Ja dźěłam z ruku ‘I work 
with my hand’ (instrument); and Upper Sorbian Ja rěču z přećelom ‘I speak 
with my friend’ (accompaniment). This is one area where Czech has not been 
influenced by German and has maintained the more Slavic pattern (Toops 1999: 
275) and does not use a preposition: Cz Mluví chraplavým hlasem ‘He speaks 
with a hoarse voice’. 

Other cases of German influence are less clear-cut. One example is the 
German construction würde + infinitive which can function as a (non-past) conditional 
or as a future-in-the-past. Upper Sorbian shows an analogous use of the 
(Slavic) conditional construction consisting of by and the l-participle as a 
future-in-the-past, a use not found in other Slavic languages. That said, the remainder 
of an iterative preterite in many Sorbian dialects could also be the cause of or, 
more likely, simply the supporter of this change (Toops 2006: 153-4). 


5.3 Polabian 


Polabian was spoken on the left bank of the Elbe River and so was the western- 
most variety of the Lechitic sub-branch of West Slavic. It was subject to heavy 
German contact, or more specifically Low German, from the Middle Ages until 
it became extinct in the mid eighteenth century. The last fluent speaker died in 1756 
(Szydłowska-Ceglowa 1987: 612). Relatively little documentation for Polabian has 
survived, and what there is consists primarily of word lists or lexicons with a few 
texts, for a total of approximately 2,800 lexical items. Some 20 percent of this extant 
lexicon consists of German borrowings which were phonologically assimilated 
and morphologically adapted to Polabian. German also had an impact on 
Polabian morphosyntax. Examples include the development of separable compound 
verbs, using both German and Polabian prefixes and particles, as in Pb to < LGmc 
*to, seen in Pb to-vist (< *to-vesti), alternatively vizé-to (< *vezétv-to) ‘to drive to’; to- 
ziné or ziné-to (< *to-Zenetv) ‘to drive to’; or, using Slavic roots, vanau dojé (< *vonu 
dajet) ‘gives out’ (Polański 1993: 819). Other examples of German influence are 
the new perfect tense forms of the verbs ‘to be’ (Pb bidit < *byti) and ‘to have’ (Pb 
met < *ometi); and the second person plural pronoun jai (from Middle Low German 
jt). As with other Slavic languages in longstanding contact with German (e.g. Upper 
Sorbian), in Polabian we see the spread of the use of the preposition ‘with’ with 
the instrumental case where other Slavic languages use only the instrumental. 
Other constructions arguably but less certainly stem from German influence: 
(1) the use of a subject pronoun with what would be impersonal constructions in 
Slavic: ‘it thunders’ Pb gramé (< *gromitv) versus tii gramé, G es donnert; and (2) the 
use of ka plus the dative of a verbal substantive, presumably on the model of G 
zu plus the infinitive, as in Pb nemim nic ka vdidoné versus G ich habe nichts auszugeben 
‘I have nothing to give away’ (Polański 1993: 796). 


5.4 Polish 


By and large the impact of language contact on Polish has been in the area of the 
lexicon, with the exception of Latin, which also had an impact on morphology. 
Two languages stand out in this regard: Czech and German. Czech missionaries 
brought Christianity and the Latin language and alphabet to Poland in 966. 
Thanks to the close cultural and political ties of Czech Bohemia with Poland, Czech 
had a major influence on the language, in particular in the realm of vocabulary. 
These borrowings come from a large number of semantic fields, such as education, 
science, religion, and cultural life, all reflecting the sense of Czech as the language 
representing the dominant culture, the one to be emulated. Due to German 
colonization, some regions of Poland were German-speaking and some, such as 
Cracow, are known to have even conducted business in German until some point 
in the sixteenth century. But by and large the relationship of the three languages 
— Czech, German, and Polish — is so intertwined that it is often difficult to deter- 
mine whether a borrowing came directly from German or came into Polish via 
Czech. A significant number of German borrowings are, not unexpectedly, in the 
realm of government, commerce, and town planning, such as burmistrz ‘mayor’; 
borg ‘credit’ or gaska ‘street’. (See Schenker 1985: 198-9 for examples of both Czech 
and German borrowings.) 

Despite the long dominance of Latin in religious spheres, it did not have a real 
impact on Polish until the time of the Renaissance, when it came to be viewed as 
the language of high culture and was acquired by the nobility as a means of 
communication. Latin influence is seen in modern Polish in the phonology of a 
few borrowed words which have stress on the antepenultimate syllable, not the 
penultimate, as is the norm (e.g. muzyka ‘music’ or publika ‘public’). In morphology 
it is seen in the declension type of neuter nouns in -um (liceum ‘high school’); the 
nominative plural -a of some masculine nouns (koszta ‘costs’); the prefixes arcy-, 
super-; and the derivational suffixes -acja, -ysta/-ista, -yzm/-izm. In addition, a num- 
ber of fixed phrases follow noun-adjective order (e.g. wojna domowa ‘civil war’), 
again from Latin. A large number of lexical borrowings and calques also came 
from Latin. Later, in the late eighteenth century and into the nineteenth century, 
a number of borrowings came from French; these can generally be traced to the 
strong influence of court life at Versailles and include such words as dama ‘lady’; 
kotylion ‘cotillion’; and krawat ‘tie’, to name just a few (see Schenker 1985). 

Silesian (also called Upper Silesian or, pejoratively, Wasserpolnisch ‘watered-down 
Polish’) is worthy of special mention. It is spoken in the region of Upper Silesia 
in territory that extends from Poland into the Czech Republic. Its status as a 
separate language or a dialect of Polish has been debated. Structurally it is very 
similar to Polish in terms of phonology and morphology but with some differ- 
ences, including the loss of nasal vowels. Moreover, it shows greater influence in 
particular of Czech, especially in the lexicon and syntax, and also of German and 
Slovak (see Hannan 1996). 


6 Slavic Languages in Contact 


6.1 Czech and Slovak 


The Czech language was codified centuries before Slovak and so was well 
positioned, linguistically and sociopolitically, to have a significant impact on it. In 
fact Czech functioned as the language of literature and culture in Slovak-speaking 
regions during the fifteenth to nineteenth centuries, continuing up to the end of 
the twentieth century in some groups. The notion of a unified “Czechoslovak” 
emerged during the eighteenth century in the Slovak Protestant environment. Czech 
was seen as the standard literary language for both Czechs and Slovaks but took 
on a number of Slovak features in the Slovak-speaking communities, and for some 
Protestants was even understood to be a synthesis of Czech and Slovak. Even after 
the standardization of Slovak in 1843 by Ľudovít Štúr, “Czechoslovak,” or bibličtina 
‘Biblical language’ as it was frequently called, continued to be the liturgical 
language for Slovak Protestants, and retained some of its functions until the end 
of the twentieth century. Ultimately the result was diglossia, with Czech used as 
the liturgical and sacred language and Slovak in all secular domains. The tendency 
toward merging Slovak with Czech only increased in the first Czechoslovak Republic 
(1918-39), where official language policy moved the two toward unification. The 
overall result was a destabilization of standard Slovak and a tendency for speakers 
to mix varieties (Nábělková 2007). 


6.2 East Slavic I: Russian, Ukrainian, and Belarusian 


Russian, Ukrainian, and Belarusian have long been in contact and have mutually 
influenced one another. The East Slavic languages constitute a language-dialect 
continuum, and the boundaries which define “languages” as opposed to “dialects” 
are more political than linguistic. (The same is true in the West, and the border 
region between Belarus and Poland is a transitional zone, linguistically and cul- 
turally.) It is clear that an East Slavic variety, distinct from other varieties of Slavic, 
emerged during the seventh to eighth centuries but it is impossible to pinpoint the 
time when the three East Slavic varieties could be considered discrete languages 
and a pan-East Slavic that was mutually intelligible ceased to exist. The use of 
Church Slavonic as the written language for the region into the fifteenth century 
further complicates the picture. Finally, although the impact of Russian on 
Ukrainian and Belarusian predates the Soviet period, it is important to note that 
all three were spoken in territory within the former Soviet Union. Language poli- 
cies aimed at assimilation to Russian as well as the social, political, and economic 
dominance of Russian, have certainly had an impact on the direction of change. 

Two mixed languages have been identified as the result of language contact 
among the Slavs: Trasjanka, a mix of Belarusian and Russian, and Surzhyk, a mix 
of Ukrainian and Russian. Both terms are pejorative, and their literal meanings 
refer to lower-grade grain: trasjanka refers to a mixture of wheat and straw, 
and surzhyk of wheat and rye. Further research, based on large corpora of naturally 
occurring connected discourse in both mixed varieties, is needed to determine just 
how regular such mixtures are. 


6.3 East Slavic I: Russian, Ukrainian, and Surzhyk 


Since Ukraine became a part of Russia with the Treaty of Perejaslav in 1654, there 
has been ongoing contact between the Ukrainian and Russian languages, a contact 
shaped by varying social and political pressures over time. In the seventeenth and 
early eighteenth centuries the language influence was bidirectional and mutual. 
During this time period, Ukrainian had an impact on Russian in both religious 
and secular circles, serving as the intermediary for loanwords, in particular from 
Polish. Ukrainian clergy held high positions in Moscow, and so Ukrainian pro- 
nunciation had an impact on Church Slavonic, and Ukrainian grammatical and 
rhetorical traditions were adapted to the liturgy. But by the end of the eighteenth 
century, the influence of Russian, and in particular the Russified variant of 
Church Slavonic, was so strong that it began penetrating all parts of Slavic ortho- 
doxy, including even Transcarpathia and Bukovyna. Russification of Ukraine 
intensified during the periods of the Russian Empire and the Soviet Union. By 
the end of the Soviet era, it is possible to speak of diglossia in Ukraine, with Russian 
as the High variety used in formal, administrative, and educational domains, and 
Ukrainian in less formal, home settings. 

One result of this diglossia is Surzhyk, a ‘hybrid sociolect’ (Taranenko 2007: 125) 
with Ukrainian as a matrix language and certain inserted Russian features. More 
research is needed to determine to what extent the mixture of Russian and 
Ukrainian found in Surzhyk is regular and predictable and to what extent it is 
idiosyncratic, although we can identify a major division between those speakers 
who use fused-lect Surzhyk as their native tongue and are not fluent in any other 
language variety, and those who mix Ukrainian and Russian due to incomplete 
knowledge of just one of them, and fluency in the other. In addition, there are 
speakers who know both languages, but who mix them in speech because that is 
perceived to be the norm for their speech community, or at least the norm in a 
given setting (Bilaniuk 2004). 

Many studies to date have had a prescriptive focus, arguing against language 
mixture. The most comprehensive study so far is Bilaniuk (2005), which looks 
primarily at Eastern Ukraine; differences might be found in the western, less 
Russified regions. The linguistic characteristics of Surzhyk include the following 
(Flier 2000): 


1 Borrowings and lexical calques of Russian words not normally found in 
Ukrainian (e.g. R nakonec, Sur nakonic’ versus Ukr narešti ‘finally’; R/Sur 
pokupateli, Ukr pokupci ‘shoppers’; or R stolovaja, Sur stolova versus Ukr idal’nja 
‘dining room’, where Surzhyk adapts the Russian substantivized adjective 
to Surzhyk/Ukrainian adjectival morphology). There appear to be no constraints 
on lexical borrowing and calques. 

2 Syntactic calques, primarily in prepositional and numeral phrases, as well as 
head phrases, where government patterns differ between Ukrainian and 
Russian, as in R soveščanie po problemam, Sur narada po problemam ‘a conference 
on issues’, using the preposition po ‘on’, ‘concerning’ with the dative case, versus 
Ukr narada z problem, using z plus the genitive. (The preposition po 
does not otherwise govern the dative in Ukrainian.) Another example is in 
time expressions, following the Russian model of the preposition v ‘at’ plus 
the accusative of a cardinal numeral which itself then governs the genitive 
(R v desjat’ časov, Sur v desjat’ godyn ‘at ten o'clock’) versus the Ukrainian 
pattern using o plus the locative case of an ordinal numeral and noun (Ukr o 
desjatij godyni). 

3 Pronouns and adjectives follow Ukrainian morphology. In nouns, the voca- 
tive follows the Russian pattern of being identical to the nominative, and the 
genitive singular of masculine nouns is generally -a, not -u, unlike Ukrainian, 
where the reverse is true. The dative singular of masculine animate nouns also 
typically follows the Russian -u, as opposed to the Ukrainian form in -ovi. 

4 Verb paradigms largely follow Ukrainian forms: the third person non-past 
ending in -t’ is not replaced by Russian nonpalatalized -t, and the preterite uses the 
Ukrainian ending -v and not the Russian ending -l; but Surzhyk tends to form 
a first person plural imperative using the non-past first person form on the 
Russian model. 


Surzhyk cannot be considered a single, homogeneous linguistic variety but is 
rather a set of varieties, with differences depending on several parameters: rural 
versus urban; level of education; and time period (pre-Soviet, Soviet, post-Soviet). 
These have been analyzed as five different categories (Bilaniuk 2004; 2005) but 
are probably more accurately seen as different sociolects which have changed over 
time but can be united in one larger category of Surzhyk. 


6.4 East Slavic II: Russian, Belarusian, and Trasjanka 


Over the course of history Belarusian has been spoken in a region which has been 
under a variety of different political controls and, consequently, under different 
linguistic influences and pressures. Linguistically Belarusian is close to both 
Russian and Polish and can be seen as transitional between the two. The East Slavic 
predecessors of the Belarusians moved into the territory north of what is now 
Ukraine and west of Russia; in the tenth to eleventh centuries the region became 
part of Kievan Rus. The Mongol invasion of Rus in the thirteenth century did not 
extend to their territory, thus politically (and linguistically) separating them from 
their East Slavic neighbors; the territory was incorporated into the Grand Duchy of 
Lithuania in the fourteenth century. An estimated two thirds of the population of the 
Grand Duchy was Slavic and the local version of Church Slavonic was adopted 
as the official language. It was called Rusky during this period, although Russians 
in Moscow referred to it as Lithuanian, while in Ukraine the written form was seen 
as Russian and the colloquial (spoken) language as Lithuanian. In both cases the term 
“Lithuanian” refers to the territory, not the language, yet its use is symptomatic 
of the overall social situation. 

With the Union of Lublin in 1569, Belarus became part of the 
Polish–Lithuanian Commonwealth; Latin and Polish were used as official administrative 
languages, and the nobility became increasingly polonized. In 1697 use of Belarusian 
was officially banned; Polish became the official language. Immigration of Polish 
gentry to the region resulted in the emergence of a variety known as polszczyzna 
kresowa ‘borderland Polish’, Polish with a significant Belarusian substrate. During 
this time the greatest number of Polish loanwords entered the language. These 
fall into a number of different semantic categories, ranging from everyday 
vocabulary to political and military terminology, such as vjandlina ‘ham’, kséndz 
‘prince’, or zbroja ‘weapons’. With the partitions of Poland (1772-95), the region 
was reincorporated into the Russian empire. An uprising in January 1831 against 
tsarist rule resulted in restrictions on the activities and influence of the Catholic 
Church and on the use of the vernacular. The name of the region was 
officially changed from Belorussija to Severo-zapadnyj kraj ‘Northwestern region’ 
and Russian replaced Polish in all public spheres, including education, govern- 
ment, and the courts. Russian loanwords from the nineteenth century reflect the 
political climate at the time: ssylka ‘exile’ and perevarot ‘revolution’. Russification 
only intensified during the Soviet period and loanwords from all domains 
entered the lexicon, although the influx of technical, political, and scientific ter- 
minology is particularly noteworthy. 

Belarusian has long existed under the shadow of Russian and is seen by many, 
even today, as a substandard form of CSR. In fact, in western and northwestern 
Belarus, Belarusians themselves have considered Russian to be a standardized 
variety of their own language, not a separate language (Gustavsson 1997: 1922). 
Trasjanka thus stems from a combination of the intense Russification policies, the 
prestige of Russian, and the linguistic similarity of Belarusian and Russian. In 
current politicized, nationalist discourse in Belarus, the term “Trasjanka” is often 
used to refer to any kind of speech which in some way deviates from standardized 
Belarusian or Russian languages. For linguists the term refers specifically to a mixed 
language that combines elements of both languages. As with Surzhyk, there is 
some disagreement as to how regular the combinations are and to what extent 
they are idiosyncratic, and the number of Russianisms may vary from speaker to 
speaker. Trasjanka typically has Belarusian phonetics and intonation, a mixed 
morphology, and a mixed lexicon. It is possible to find examples where the preposition 
is Belarusian and the nominal morphology Russian, as in ab cheloveke ‘about the 
person’. This mixture is illustrated in the following example, where the Russian 
elements are underlined and the Belarusian in bold face: 


(1) Scas pagljazu iakie sapozki pradajuc’ 
now I'll look what boots are selling 
‘I'll take a look to see which boots are for sale’ 


Phonetically there are Belarusian elements, such as the pronunciation of pagljazu 
(versus a more R pagljizu) with lexical elements from both. The verb pradajuc’ could 
be found in either language, but the final consonant (part of the third person 
plural suffix) -c’ is BR, instead of the expected R -t. Although Trasjanka is typically 
defined as a jargon or as being limited to casual speech, it is widespread in 
speech at all levels; Gustavsson (1997) claims that there are few people 
who can speak without some admixture of Russian. It also occurs in written form, 
in particular in texts purporting to report actual speech. 


7 Conclusion 


This chapter has provided a brief survey of some of the key contact phenomena 
in the history and development of the Slavic languages. It has not touched upon the 
question of the impact of Slavic on other, non-Slavic languages. Although I have 
provided only the briefest of overviews of some of the modern Slavic languages, 
it is fair to say that there is no Slavic language which has been unaffected by 
contact. In fact, many of the contact effects date to prehistoric times and the 
structure of modern Slavic languages can only be understood with some under- 
standing of earlier and current contact. 


NOTES 


1 The following abbreviations are used here for these and other languages with which 
they come into contact: BCS = Bosnian-Croatian-Serbian; BR = Belarusian; CSR = 
Contemporary Standard Russian; Cz = Czech; ESI = East Slavic; G = German; Gk = Greek; 
Goth = Gothic; LGmc = Late Germanic; OCS = Old Church Slavonic; P = Polish; Pb = 
Polabian; PGmc = Proto-Germanic; R = Russian; Sk = Slovak; Sur = Surzhyk; Ukr = 
Ukrainian. 

2 Rusyn is alternatively referred to as Ruthenian, Carpatho-Rusyn, or Rusnak. The Rusyn 
population lives in Subcarpathian Ukraine, in the Lemko region of Poland, in the Prešov 
region of Slovakia, in Vojvodina in Serbia, and in Croatia (Vaňko 2007). 

3 See Vlasto (1988: 10-23) for a succinct survey of the differences in phonology and 
morphology. 

4 This has similarly been noted in Croatian dialects spoken in Italy, e.g. Molisean Croatian 
under Italian influence s nožem, Italian con un coltello ‘with a knife’ (Breu 1996: 26). 

5 See the online dictionary Silesian–Polish (Słownik Śląski) at 
http://www.slownik_slaski.itatis.pl. 


29 Contact and the Finno-Ugric Languages 


JOHANNA LAAKSO 


1 Introduction: The Finno-Ugric/Uralic Language Family 


The Uralic language family, consisting of 20-40 languages spoken in eastern Europe 
and western Siberia, is traditionally described in terms of a binary family tree, 
starting from the first binary split of Proto-Uralic into Proto-Finno-Ugric and Proto- 
Samoyedic. However, this model is not unanimously accepted. In the alternative, 
more “bush-like” models, Samoyedic is simply one of the three or more main 
branches, which means that the terms Uralic and Finno-Ugric can be used as 
synonyms. This practice is adopted also in this chapter. 


The Finno-Ugric/Uralic language family, irrespective of the structure of the 
postulated family tree, consists of six main branches: 


1 The Finnic-Saami branch. The Finnic (aka “Baltic Finnic” or “Fennic”) languages 
include two nation-state languages (Finnish with ca. 5 million speakers, Estonian 
with ca. 1 million speakers) and five to six minority languages spoken in Latvia 
(the almost extinct Livonian) and northwest Russia (Karelian, Veps, Ingrian, 
Vote; of these Karelian has some 60,000 speakers, albeit divided between three 
to four deeply different main dialects, while Veps is only spoken by ca. 6,000 
people; Vote and Ingrian are obviously facing extinction). The Finnish varieties 
in northern Sweden (Tornedal Finnish, also known as Meänkieli ‘our language’; 
Winsa 1998) and Norway (Kven) and the southeastern variety of Estonian 
(Võro-Seto) have recently begun to develop their own standard literary languages. 
The Saami (Sami, ‘Lapp’) languages form a long dialect continuum, divisible 
into six to ten languages, stretching from Sweden and Norway through Finnish 
Lapland to the Kola Peninsula in northwest Russia. The largest Saami 
language, Northern Saami, spoken in Finland, Sweden, and Norway, has some 
30,000 speakers; other Saami varieties are more or less seriously endangered. 

2 Mordvin, spoken in European Russia by more than 600,000 people. These are 
divided into (at least) two ethnic groups, Erzya and Moksha, both having a 
standard language of their own. 

3 Mari (aka Cheremis), spoken in European Russia by ca. 500,000 people. There 
are two standard language varieties, East (Meadow) and West (Hill) Mari. Mari 
and Mordvin are sometimes bundled together as the “Volgaic branch,” but 
there seems to be no linguistic evidence for “Proto-Volgaic.” 

4 The Permic branch includes two or three languages of European Russia: Komi 
(also known as Zyryan), with two standardized varieties, Komi-Zyryan and 
Komi-Permyak, and more than 300,000 speakers in total, and Udmurt (aka 
Votyak), with more than 450,000 speakers. 

5 The Ugric branch is the most heterogeneous subgroup, as the relatedness between 
Hungarian (the largest Finno-Ugric language, with 13-15 million speakers in 
Hungary, adjacent areas, and other countries) and the two Ob-Ugric languages 
in western Siberia is rather distant. The Ob-Ugric languages, Mansi (aka Vogul) 
and Khanty (also known as Ostyak), with a few thousand speakers each, are 
severely endangered. The dialectal divisions within the Ob-Ugric languages are 
deep; Eastern Khanty in particular might be considered a separate language. 

6 The Samoyed branch now includes four languages spoken in western Siberia: 
Nenets (aka Yurak, with two deeply different varieties: Tundra and Forest 
Nenets), Enets (aka Yenisey Samoyed), Nganasan (aka Tawgy Samoyed), 
and Selkup (also known as Ostyak Samoyed). Tundra Nenets has ca. 30,000 
speakers, while all other Samoyed languages are seriously endangered or almost 
extinct. Some extinct Samoyed languages are also known; the most extensively 
documented is Kamass, the last speaker of which died in 1989. 


According to most mainstream researchers, Proto-Uralic was a contemporary (and, 
perhaps, neighbor) of Proto-Indo-European (for the research on early contacts, see 
in particular Carpelan, Parpola, and Koskikallio 2001), and the relatedness between 
the subgroups of Uralic is thus comparable to that between the branches of Indo- 
European. Due to this high genetic diversity, as well as the vast differences between 
the ecological and sociopolitical environments of today’s Finno-Ugric languages, 
an exhaustive description of all language contact situations involving Finno- 
Ugric is an impossible task. In what follows, I will merely present a quick survey 
of the diversity of relevant language contacts, followed by some exemplary 
cases. In particular, I would like to draw attention to the rich Finno-Ugristic research 
tradition which, being published in “less accessible” languages such as German, 
Russian, Hungarian, or Finnish, is often overlooked in today’s linguistic research. 


2 Language Contact Situations Involving 
Finno-Ugric: An Overview 


2.1 Finno-Ugric minorities and the majority language 


Most Finno-Ugric languages of today — i.e. all except Hungarian, Finnish, and 
Estonian in their nation-states — are endangered minority languages that are mainly 
used among family members and friends or in small communities, in connection 
with the traditional way of living or the ethnic heritage in general. Practically all 
speakers of these languages are bilingual, and for most of them, in particular for 
younger speakers, the majority language is the dominant or the sole language 
of higher education (or even all education), professional communication, literary 
culture, media, and urban life in general. This applies to the Finno-Ugric minorities 
in Russia and also to the Saami in general and to the last Livonians in Latvia. 
It is also true of the Finnish minorities in Sweden and Norway, of the remnants 
of post-World War II Estonian and Hungarian emigrant groups west of the former 
Iron Curtain, and, to some extent, of at least some of the old Hungarian minorities 
in the neighboring states of Hungary. 

Of course, the degree of endangerment and the disruption of language transmission 
vary greatly. In some traditionally Hungarian-speaking areas outside 
Hungary, Hungarian can still be the majority language, used in professional 
communication, media, primary schools, and higher education. The Finno-Ugric 
peoples in Russia, in contrast, are a minority even in all their titular republics 
and “autonomous” areas, and despite language laws and institutions (such as 
national theaters, museums, publications, and research institutes) the presence of 
these languages in urban life, in the media, or in the education system is often 
very marginal. 

For the Finno-Ugric minorities in Russia, the dominant majority language 
today is Russian. However, there are clear differences in the age and depth of the 
impact of Russian: In the Volga region, the cultural and political dominance of 
Turkic-speaking peoples only gradually gave way to Russian from the sixteenth 
century on, and for the Finno-Ugrians of this region (Mari, Udmurt, to some extent 
also Mordvin), the most important contact language even until the twentieth 
century could still be Tatar, Chuvash, or Bashkir. Siberia was colonized only after 
the Middle Ages, and the dominance of Russian administration and culture 
remained rather superficial until the twentieth century; instead, there were con- 
tacts between the indigenous Uralic and non-Uralic (Yukaghir, Yeniseic, Turkic, 
Tungusic) languages of Siberia. 

In contrast, the eastern Finnic peoples (Karelians, Veps, Votes, Ingrians), in the 
vicinity of Novgorod, belonged to the core area of emerging Russian nationhood 
already in the Middle Ages. Also the Komi in northern Russia (see e.g. Leinonen 
2002) and, to some extent, the Mordvin were incorporated by the Russian state 
early on. This long history of close contacts with Russian manifests itself in a plethora 
of loanwords, parallel structural developments, and even Sprachbund-like phe- 
nomena such as those between northwest Russian dialects and eastern Finnic 
(cf. Sarhimaa 1992; 1999; Helimski 2003). 


2.2 “Western” Finno-Ugric languages and 
their neighbors 


From the Middle Ages on, the Finnic and Saami languages — already deeply marked 
by intensive contacts with (Pre-)Germanic and (Pre-)Baltic and also showing 
some traces of early Slavic contacts — were divided by the great border between 
East and West, also the border line between eastern and western Christianity and 
coinciding with the western border of the emerging Russian empire. West of 
this boundary, the politically dominant languages were Scandinavian (Swedish 
or Norwegian) or, south of the Gulf of Finland, German. Lapland outside the 
Russian power sphere was gradually divided between the states of Sweden and 
Norway, while today’s Finland arose from the part of the Finnic area which was 
annexed to Sweden. 

The border between Sweden and Russia divided the daughter dialects of 
Proto-Karelian into today’s Karelian and East Finnish and joined the East Finnish 
dialects with the West Finnish dialects, thus gradually weakening the originally 
deep dialect border between East and West Finnish. In this area later known by 
the name of Finland, Swedish (beside and after Latin) became the dominant 
language of administration, higher education (although elementary literacy in 
Finnish was promoted by the Reformation from the sixteenth century on), and 
higher social strata. From the seventeenth to the nineteenth century, Finnish was 
the language of peasants and servants, and upward social mobility inevitably led 
to language shift. This superstrate influence resulted in hundreds of Swedish loan- 
words, loan translations, and other influences (cf. de Smit 2006). Partly hidden 
under the Swedish influences, there are also elements from West European cul- 
ture languages in Finnish, such as Latin (whether these words were conveyed by 
Swedish or borrowed directly cannot always be unambiguously assessed) or German 
(Low German loanwords in particular are sometimes indistinguishable from the 
Swedish ones, cf. Bentlin 2008). 

In Estonia, the situation was largely a mirror image of that in Finland: The 
dominant language in the colonized Baltic countries until the nineteenth century 
was German (Low German, later ousted by High German as in the northern parts 
of the German-speaking area in general; cf. Hinderling 1981), Estonian being mainly 
the language of serfs and servants, cultivated since the Reformation only sparsely 
for the goals of the Lutheran church and elementary education. According to the 
often-quoted statistics of Rätsep (1983), 24 percent of the underived words in 
Standard Estonian (modern internationalisms excluded) are of either Low German 
or High German origin. There were also influences from Swedish (sometimes 
difficult to distinguish from the ample Low German elements; cf. Raag 1987; 1997), 
Latvian (cf. Vaba 1977), and other languages. Livonian, since the Middle Ages 
rapidly giving way to Latvian, was deeply marked by the Latvian language 
(Suhonen 1973), superseded by the dominant German. 

The impact of Russian on Finnish and Estonian was clearly weaker than that 
of Swedish or German. However, quite a few Russian loanwords were adopted 
into the easternmost dialects (for Estonian, see Must 2000) and, in Estonia in 
particular, from the language of Russian administration in the eighteenth and 
nineteenth centuries. Some of these words, sometimes spreading by way of gradual 
diffusion and intertwining with native descriptive words (Jarva 2001), made their 
way into the standard languages (Plöger 1973; Blokland 2005). 

Unlike most of their linguistic relatives, the linguistic ancestors of the 
Hungarians, adopting a mobile, semi-nomadic way of life, passed through various 
language contact situations, partly reconstructible only on the basis of early 
loanword strata. In contact with Iranian (cf. Korenchy 1988) and Turkic (Róna-Tas 
1988) languages, perhaps also with proto-forms of the distantly related 
Permic languages (Rédei 1964b), they wandered across the steppe zone north of 
the Black Sea and finally, toward the end of the first millennium CE, reached the 
Carpathian basin, ousting or assimilating Slavic speakers (cf. Kniezsa 1955; Décsy 
1988). In addition to Slavic and Romanian minorities and neighbors as well as 
Turkic-speaking groups, Hungarian had intensive contacts with German — also 
the language of diverse minority groups in historical Hungary and, later on, 
a dominant language of culture and education — and other languages. The 
influence of Latin, the language of administration and education in Hungary even 
until the nineteenth century, was remarkably strong. 

Prompted by Romantic Nationalism from the nineteenth century on, all three 
Finno-Ugric nation-state languages experienced an intensive phase of puristic 
language planning often explicitly attempting to reverse contact-induced change. 
In addition to lexical purism resulting in numerous native-based neologisms, 
there were attempts to get rid of identifiable foreign influences in phonology and 
orthography or, for instance, word order models (such as verb-final subordinated 
clauses in Estonian) or word-formation strategies (such as the adjectives in -rikas 
‘rich’ or -vapaa ‘free’ in Finnish, mirroring Swedish adjectives in -rik or -fri) which 
were perceived as foreign. 

In all three Finno-Ugric nation-states, there are considerable language minor- 
ities. Hungarian in particular played a role uniquely dominant among the Uralic 
languages, vis-a-vis the numerous German-speaking, Slavic, and Romanian 
minorities of historical Hungary until the end of the Austro-Hungarian empire. 
In post-World War I Hungary, which lost two thirds of its area to the new neigh- 
bor states in the peace treaty of Trianon (simultaneously, one third of ethnic 
Hungarians became minorities in the new neighbor countries), the numbers of 
remaining minorities, now ranging from a few thousand (Slavic, Romanian) to 
more than 30,000 (German) or some 50,000 speakers (Romani), have generally been 
receding, while the most numerous minority, the Roma, suffers from stigmatiza- 
tion and social problems. 

Estonia lost most of her “old” minorities (Baltic Germans, Swedes, Jews, and 
Roma, part of the “old” Russian minority) in connection with World War II, but 
the massive immigration from other parts of the Soviet Union in the post-war 
decades created a new, large, and mostly Russian-speaking minority. During the 
Soviet period, very little was done to promote knowledge of Estonian among this 
minority, but now it is increasingly exposed to Estonian, the only official state 
language since the restoration of independence in 1991. 

In Finland, the Finland variety of Swedish, spoken traditionally in southern 
and western coastal areas, is an official national language beside Finnish and has 
a strong institutional basis. However, Finland Swedish is spoken by less 
than 6 percent of Finland’s population. It is receding especially in the originally 
Swedish-speaking Helsinki region and subject to the triple pressure of the 
Finnish majority, the Sweden-Swedish standard, and global English (Saari 2000; 
Östman 2006). Beside other “old” minority languages (three Saami varieties, 
Romani, etc.), there are languages spoken by rapidly growing immigrant groups 
in Finland since the 1980s and 1990s, such as Russian and Estonian. 


2.3 Contact within Finno-Ugric 


With the exception of the quasi-isolate Hungarian, most Finno-Ugric languages 
have (or have until recently had) contacts with related languages, sometimes result- 
ing in complicated, Sprachbund-like networks, in which not only lexical borrowing 
but also transmission of morphological elements is possible. Within the Finnic group 
in particular, the internal contacts make it practically impossible to describe the 
genetic relatedness between the sister languages in terms of traditional family trees. 
Contacts between Finnic (in particular, Finnish and Karelian) and Saami have 
also been intensive, to such an extent that there have been attempts to replace 
the traditional “intermediate” Finnic-Saamic protolanguage (Early Proto-Finnic) 
with a model of more distantly related Finnic, Saamic, and other “contact blocs” 
(Itkonen 1997). However, Proto-Finnic-Saamic is unambiguously reconstructible, 
as shown by Korhonen (1981), and Sammallahti (1999) presents convincing 
counter-arguments to Itkonen’s mainly lexically founded hypothesis. 

In Siberia, similarly close relationships have developed between different vari- 
eties of Mansi and Khanty (Honti 1998: 352), the speakers of which were cultur- 
ally close to each other and sometimes connected by interethnic social networks. 
Of the Permic languages, Udmurt belongs to the same “Volgaic” cultural sphere 
as Mari and Mordvin, characterized by intensive contacts with Turkic, and as 
Mari and Mordvin in particular are spoken in scattered language islets across the 
whole Volga region, there are local contacts between individual varieties of these 
languages (see e.g. Bereczki 2007). Komi, on the other hand, bears traces of contacts 
with Finnic or “Para-Finnic” (Northwest-Finno-Ugric) languages probably spoken 
in today’s northern Russia before the East Slavic expansion; there are some good 
loanword etymologies, but the hypotheses cautiously formulated by Hausenberg 
(1998) about the possible role of Finnic contacts for the morphosyntactic diver- 
gence of Komi from its sister language Udmurt still call for further research, also 
in the light of the developing substrate language research (cf. section 3.2 below). 
The Komi also came into contact with the westward-spreading Nenets in the utmost 
northeast of European Russia (Wichmann 1902; Rédei 1962), and as merchants, 
middlemen, and colonists in western Siberia, with the Mansi and Khanty (Rédei 
1964a; 1970). 


2.4 Typical(?) outcomes of language contact 


The comparative linguistic study of Finno-Ugric languages is characterized by a 
strong tradition of etymology and loanword research. In addition to etymolo- 
gical dictionaries (mainly of Finnish and Hungarian), there are numerous studies 
of specific loanword strata, also in the minor Finno-Ugric languages. Contacts with 
Indo-European, especially (Pre-)Germanic loanwords in Finnic (or Northwestern 
Finno-Ugric), have been the subject of particularly intensive investigations; in 
Hungary, there is a corresponding research tradition dealing with the abundant 
Turkic loanwords in Hungarian. In contrast, research into contact-induced “Indo- 
Europeanization” (or “Turkicization”) in other subsystems of language than the 
lexicon has been much less intensive, and there are controversies symptomatic of 
fundamental problems of language contact research; some of these will be dealt 
with in more detail in what follows. 

In the most intensive contact situations, even adoption of morphological ele- 
ments, such as Latvian verbal prefixes for aspectual or Aktionsart meanings in 
Livonian (or, correspondingly, Russian prefixes in Karelian; Kiefer & Honti 
2003), has been attested. In contacts within Finno-Ugric, examples of transmis- 
sion of inflectional or derivational morphology are known at least between 
Finnic and Saami. In syntax, a classic example of borrowed elements triggering 
deep-going structural changes is the borrowing of conjunctions: in many Finno- 
Ugric languages of Russia, conjunctions borrowed from Russian have replaced 
inherited means of expressing causal, temporal, etc. relations with converbs or 
other nonfinite verb forms (for an interesting case study in Khanty, see Csepregi 
1997; cf. Thomason 2001: 62). 

There are also examples of systematic adoption of phonological features and 
phonotactic constraints: Already in the early twentieth century, younger Livonian 
speakers substituted the ü and ö vowels (unknown in Latvian) with i and e. Livonian 
has also taken over the accent/intonation system of Latvian, including the often-
mentioned stød. The aspiration of voiceless word-initial stops as in Scandinavian 
has spread from Swedish (and Norwegian) to the language of the old Finnish 
minority in northern Sweden and to some Saami varieties. 

In some cases, the exposure of the Finno-Ugrians to the majority or culturally 
dominant language was weak enough to allow for the development of a 
pidginized variety: examples include Govorka or the Taimyr Peninsula Pidgin 
Russian in northernmost Siberia (Wurm 1996), Halbdeutsch or pidginized German 
spoken by uneducated Estonians until the nineteenth century (Lehiste 1965), 
and the poorly documented gavppe-daro (“Trade Norwegian”) and borgarmålet 
(pidginized Swedish) spoken by some Saami groups in the eighteenth or nine- 
teenth century (Jahr 1996). 

Compared with the research on contact-conditioned changes in Finno-Ugric, 
there is much less literature on the possible impact of Finno-Ugric languages on 
other languages — with the exception of substrate studies (see section 3.2 below) 
and the study of majority-language impact on the minority languages in the Finno- 
Ugric nation-states, such as the Fennisms in Finland Swedish (Saari 2000) or Finnish 
Romani (Borin & Vuorela 1998). This obvious imbalance has many reasons. First, 
it may go back to factual power and prestige relations in the contact situation. 
In today’s Russia, for instance, practically all Finno-Ugric minority speakers are 
bilingual in Russian, while the knowledge of Finno-Ugric languages among the 
Russian majority is very rare. Even in areas where the contacts between Russians 
and Finno-Ugrians have traditionally been intensive and more balanced (such as 
the contacts between northwest Russian dialects and Karelian or Veps), the 
Finno-Ugric influences are often restricted to local substandard dialects; there are 
only a few Finnic loanwords that have managed to spread to Standard Russian. 
Second, the non-Finno-Ugric contact partner may not be accessible to research any 
more. It is generally assumed, for instance, that the numerous (Pre-)Baltic and 
(Pre-)Germanic loanwords in Finnic were at least partly acquired from Indo- 
European (IE) speaker populations who lived among the Proto-Finnic speakers 
in present-day Finland (and Estonia?) and were later assimilated, so that their IE 
language, probably marked by contacts with Finnic, did not survive. Third, there 
may simply not be enough interest and expertise: Kallio (2000b), presenting a Finnic 
etymology for Germanic *maþon (> English moth; the Finnic word, in turn, could 
be an Indo-Iranic loan), suspects that more loanwords from Finn(o-Ugr)ic into IE 
could be found by systematic searches. 


3 Some Central Questions 


3.1 Contact or relatedness? On the earliest 
loanword strata 


The well-known similarity of certain Uralic basic vocabulary items to IE, such as 
Proto-Uralic (PU) *nimi ‘name’, *weti ‘water’ or *ku- ‘who’, has been interpreted 
in many ways. While most Finno-Ugrists are very wary of categorical statements 
in this question, there have been more or less cautiously formulated versions of 
the “Indo-Uralic” hypothesis, and Helimski (2001) prefers to regard these words 
as evidence for a Nostratic relatedness. Others have attempted to explain all 
similarities in terms of very early borrowing; the most prominent representative 
of this approach is the Finnish Indo-Europeanist Jorma Koivulehto (for a synthesis 
of his work, see Koivulehto 1999; 2001a; for sympathizing views, see also e.g. Anttila 
2000; Kallio 2002). 

Koivulehto began by discovering very early Germanic loanwords in Finnic (and, 
partly, in the neighboring branches of Finno-Ugric as well) and thus revolution- 
izing the established chronologies of Finnic-Germanic contacts. Delving deeper 
into Pre-Germanic and Northwest Indo-European, he found more and more 
loanwords representing an even more archaic, practically Proto-Indo-European 
(PIE) level of reconstruction — for instance, reflexes of PIE laryngeals in Finnic 
(Finnish kaski ‘slash-and-burn’ < PIE *Hazg- ‘ashes’) or even in Uralic. Inspired 
by the reconstruction of Proto-Uralic phonology by Janhunen and Sammallahti 
(Sammallahti 1988), which introduced a mystery consonant *x, functionally not 
unlike the PIE laryngeals, Koivulehto discovered cases of Uralic *x substituting 
a PIE laryngeal, such as PU *näxi ‘woman’ < PIE *gʷneH-, or PU *tuxli ‘wind’ 
< PIE *dʰuH-li-. Not all of these loan etymologies have been unanimously 
accepted, and the criticism highlights some central problems in the research of recon-
structed language contact. 

While acknowledging the technical brilliance of Koivulehto’s etymologies, his 
critics have pointed out that he sometimes operates with IE roots not attested firmly 
enough or extended with suffixes of a questionable status (Helimski 2001), that 
he postulates semantic shifts difficult to motivate (e.g. ‘put into motion’ > ‘row’, 
‘pour (a libation?)’ > ‘drink’), that it is not realistic to assume such a great number 
of loanwords in the basic vocabulary (Rédei 2002) and that, in general, he seems 
to over-exploit his model, relying on sound substitutions (cf. Abondolo 1998a: 7; 
Janhunen 1999). The great differences between the consonant systems of PIE and 
PU make it theoretically possible to find many kinds of IE originals; for instance, 
an initial k- in Uralic might have been used to substitute PIE *k-, *sk-, *kʷ-, *k’-, 
*g-, *g’-, *gʰ-, *gʷ- or *H-. Koivulehto also ingeniously exploits reconstructible inter-
mediate stages of a change in progress; for example, a whole set of new Indo-
Iranic etymologies is based on the idea that the satemization in Indo-Iranic (for 
instance, PIE *k’ > *ć > s) proceeded through a depalatalized affricate phase and 
that this nonpalatalized *c (or *dz), lacking an exact counterpart in Uralic, could 
have been substituted by s- or -ks- (Koivulehto 2001: 252-7). 
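
The many-to-one problem noted above can be made concrete with a small sketch. It assumes, purely for illustration, that each of the nine PIE onsets listed in the text is an admissible source for Uralic initial k-. The table and the function names are hypothetical and do not reproduce any actual rule set used by Koivulehto or his critics.

```python
# Illustrative only: the table restates the PIE onsets that the text lists as
# possible sources of Uralic initial k-. It is an assumption made for this
# sketch, not a reconstruction of any scholar's actual substitution rules.
SUBSTITUTIONS = {
    "k": ["k", "sk", "kʷ", "k’", "g", "g’", "gʰ", "gʷ", "H"],
}


def candidate_onsets(uralic_onset):
    """Return the PIE onsets that could, in principle, underlie a Uralic onset.

    Onsets without an entry in the table are treated as having exactly one
    source (themselves), which is a simplification.
    """
    return SUBSTITUTIONS.get(uralic_onset, [uralic_onset])


def search_space(uralic_segments):
    """Count the PIE segment combinations compatible with the given Uralic segments."""
    total = 1
    for segment in uralic_segments:
        total *= len(candidate_onsets(segment))
    return total
```

Under these assumptions a single Uralic k- is compatible with nine PIE onsets, and the number of admissible originals multiplies with every further ambiguous segment. This is why the critics ask for evidence beyond the phonological match itself.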

As Koivulehto’s critics see it, he has pushed the exploitation of reconstructed 
phonologies (of different stages of reconstruction) and sound substitution to its 
limits. Koivulehto may be right in referring to the well-known fact that in inten- 
sive language contact anything can be borrowed, but as long as there is hardly 
any other evidence for the intensive character of early Uralic-IE language con- 
tacts (for instance, convergent developments in Uralic and IE morphosyntax), his 
etymologies, although technically flawless, remain vulnerable. As the methods 
of historical linguistics are based on the interaction of historical phonology and 
lexicology (etymology), finding unambiguous evidence for language contact 
outside the vocabulary means a serious methodological challenge for historical 
language contact research — and, so far, this challenge remains unanswered. 

Similarly to Indo-European, the relationship between Uralic and Yukaghir, an 
indigenous language in Siberia, has triggered various speculations about pos- 
sible relatedness versus very early loan contacts. The arguments and the putative 
Uralic-Yukaghir vocabulary are summarized by Rédei (1999), who interprets 
the common words as loanwords belonging to different (Uralic, Finno-Ugric, 
Samoyedic) strata. 


3.2 Substrate studies 


In historical contact linguistics, there seems to be a growing awareness of the com- 
plexity of language contact situations in prehistoric Europe. These may well have 
involved extinct languages of unknown descent (cf. Schrijver 2001), also in the 
northernmost parts of Europe, where the linguistic map has radically changed 
due to the northward spread of both Uralic and Indo-European languages. 

The role of a substrate (“Proto-Lapp”) component in the genesis of the Saami 
languages has been a persistent question. The linguistic relatedness between 
Saami and Finnic is unmistakable, but the great differences in culture, identity, 
anthropology as well as certain vocabulary items and features of unknown origin 
in Saami have provoked diverse speculations about the Saami as a Palaeo-Arctic 
or even Asiatic people who only secondarily adopted their language from their 
Finnic neighbors. Recently, Aikio (2004) has shown that, purged of its racist 
foundations in early Finnish nationalism, the hypothesis of a non-Uralic sub-
strate in Saami can be supported by a systematic analysis of potential substrate 
vocabulary. Elaborating on the criteria presented by Salmons (1992), he lists an 
impressive number of probable substrate words, identifiable on the basis of their 
semantics (animals and plants, nature, topography, and weather conditions of the 
North), structure (un-Uralic phonotactics), and/or irregular sound correspondences 
between different dialects (pointing at multiple, separate borrowings). 

In central and northern Russia, the expansion of Slavic only began at the end 
of the first millennium CE and led to the assimilation of many presumably Finno- 
Ugric language varieties. The question of a Finno-Ugric substrate in Russian has 
time and again been dealt with in linguistic literature, most notably by Veenker 
(1967), and many linguists have paid attention to features of Russian that deviate 
from other Slavic and Indo-European languages and resemble certain Finno-Ugric 
languages (such as the loss of the habeo verb in favor of the mihi est construction, 
or the abundance of unipersonal constructions, or, in general, “anti-analytism” — 
Weiss 2004). However, the substrate interpretation often competes with other expla-
nations (for instance, the mihi est construction does have IE roots as well), and of 
the statements circulating in the literature, at least the idea of the (Moksha) Mordvin 
origin of the Russian akanje (reduction of unstressed o) must be considered 
unfounded (Ravila 1973). 

In the last few years, the question of a Finno-Ugric substrate in Russian has been 
taken up again, now concentrating on the toponymy of northern and central Russia. 
There are interesting research results (see e.g. Saarikivi 2006; Nuorluoto 2006) which 
point at a complex language situation in northern Russia before Slavicization, involv- 
ing diverse interrelated and interacting Finnic, Para-Finnic, Saamic, Para-Saamic, 
or perhaps even Para-Permic language varieties (as well as languages of unknown 
descent). As Helimski (2006) states, on the basis of recent research the family tree 
of the Finno-Ugric languages could be partly re-drawn, expanding the Finnic-Saamic 
group to a Northwest Finno-Ugric branch. 

As for Baltic, it is obvious that Latvian bears some traces of the Finnic 
languages (Livonian and/or other varieties) once spoken in parts of present-day 
Latvia (cf. Zeps 1962). There are Finnic toponyms and also some loanwords, 
especially in the so-called Livonian dialects of Latvia; it is also frequently stated 
that certain characteristic innovations in Latvian such as word-initial stress (in 
contrast to the more archaic prosody of Lithuanian) are due to Finnic influence. 
Whether there are traces of older Finno-Ugric/Pre-Finnic substrata in Baltic or 
Balto-Slavic has probably not yet been sufficiently researched (cf., however, 
Kallio 2005). 


3.3 Competing explanations — multiple causation? 


In etymological studies — the most intensively cultivated part of contact-linguistic 
studies involving Finno-Ugric — the last few years have seen a heated debate between 
Finnish etymologists, concerning competing etymologies for numerous Finnish 
lexemes and partly prompted by the preparation and publishing of the new 
etymological dictionary SSA (Suomen sanojen alkuperä: Etymologinen sanakirja [The 
origin of the Finnish words: an etymological dictionary]). This debate is also an 
indirect consequence of two different approaches in etymology: As loanword 
researchers often depart from a certain mechanism of sound substitution or 
from certain phono-morphological or semantic features typical of loanwords, they 
typically end up discovering whole clusters of new loan etymologies — and risk 
ignoring the language-internal mechanisms of lexical morphology and analogy. 
Editors of etymological dictionaries, on the other hand, must deal with the word 
stock of a language as a whole and pay attention to the internal relations, irreg- 
ular and/or secondary networks of word formation, analogy and association within 
the lexicon — and, tantalized by the rich derivational morphology and diverse 
systems of forming ideophones, sound-symbolic and expressive vocabulary in Finnic 
(cf. Mikone 2001), they may lose sight of the possibility of borrowing. 

To give but one example: is Finnish puhdas ‘clean’ (< Pre-Finnic *pustas) an early 
IE loan (‘cleansed by sifting or winnowing’?, cf. Pre-Germanic *powH-eye/o-, 
OHG fewen ‘to sift’), as stated by Koivulehto, or does it belong to the Finnic 
family of descriptive words for ‘blow, puff’ etc. (cf. puhu-, puhalta- ‘to blow’ and 
also poh-ta- ‘to sift, to winnow’), as suggested by Eino Koponen, who is also one 
of the authors of SSA? (Note that also the Finnish verb puhu- has lost its original 
expressive motivation and is now the neutral word for ‘to speak, to talk’; this 
‘fading’ of sound symbolism plays a central role in Koponen’s model.) In his 
summary of the debate, Koivulehto (2001b) sharply criticizes Koponen’s technique 
of (allegedly) operating with monosyllabic “descriptive roots,” which allows 
for unsystematic and arbitrary vowel changes and “stem extensions.” However, 
Koivulehto’s expert criticism misses one important point: Koponen’s “root 
method” does not exclude the possibility of words being primarily loanwords which 
are only secondarily attached to a family of expressive words. In the same vein, 
Vesa Jarva (2001) has investigated the process of intertwining between loanwords 
and native expressive words or elements. In effect, this would mean introducing 
the idea of multiple causation into loanword research (cf. also Laakso 2001a). 

Similar questions of native versus borrowed also arise in morphology, phono- 
logy, and syntax. In particular, the contacts between the westernmost Finno-Ugric 
and Germanic (or Standard Average European) languages are often mentioned 
as the primary explanation for certain (morpho)syntactic “Europeanization” 
phenomena. As for Finnic, typical morphosyntactic examples recurring in litera- 
ture are the development of a perfect tense with the auxiliary BE (reflecting the 
HAVE perfect in many SAE languages), the agreement of adjective modifiers 
(as in Finnish iso-ssa talo-ssa ‘big-INESSIVE house-INESSIVE’ ‘in a big house’; cf. 
Hungarian (a/egy) nagy ház-ban ‘(the/a) big house-INESSIVE’) or the word order 
change from SOV (in most Uralic languages) to SVO. For Hungarian, similar exam- 
ples are the grammaticalization of definite and indefinite articles (definite a(z) from 
the demonstrative pronoun az, indefinite egy from the numeral egy ‘1’) and the 
debated verbal prefixes or “preverbs” used for adverbial, aspectual, or Aktionsart 
meanings and thus functionally resembling the verbal prefixes in German and 
Slavic (Kiefer & Honti 2003). In phonology, the most famous case is probably 
the radical simplification of the consonant system and the development of the 
consonant gradation (a morphophonological consonant alternation) in Finnic. In 
his often-cited paper, Posti (1954) explained all these consonant changes with a 
Germanic superstrate, assuming that Proto-Finnic speakers would have imitated 
the prestigious accent of their Germanic neighbors. 

In all these cases, the similarity between possibly contact-induced phenomena 
and their purported models in neighboring languages — between the Finnic BE 
perfect and the SAE HAVE perfect, for instance, or between the Finnic consonant 
gradation and Verner’s Law in Germanic — seems obvious to an outsider. Contact 
explanations for phonological or morphosyntactic phenomena can be connected 
with Sprachbund hypotheses or, in any case, with the amply attested lexical 
contacts, the main direction of loanwords often being from the neighboring IE 
languages to Finno-Ugric. However, many of these phenomena also have internal 
and/or general explanations. For example, the consonant gradation in Finnic-Saami 
is also clearly connected with the archaic word architectonics best preserved in 
these languages (Helimski 1995), and the simplifying or reductive consonant changes 
in Pre-Finnic (such as *š > h, *mt > nt) are “natural” and do not necessarily require 
any external explanations (Kallio 2000a). The Finnic BE perfect, employing the 
copula and a past participle of the main verb (‘he is gone’), can also be compared 
with past tense categories developed from past participles in other Finno-Ugric 
languages, and the agreement of adjective modifiers could be regarded as an exten-
sion of the agreement of modifying pronouns (as in Hungarian ab-ban a ház-ban 
‘in that house’). The “verbal prefixes” in Hungarian are not genuine verbal 
prefixes but separable preverbs with native etymologies and sometimes even 
functionally similar cognates in related languages. 

In a synthesis of critical responses to diverse contact hypotheses, Honti (2007) 
sharply criticizes contact explanations in Finno-Ugric morphosyntax and ends 
by quoting Peter H. Nelde: Language contact research has no methodology yet. 
In Honti’s “insider” view, obviously, the internal explanation is always to be pre- 
ferred, other things being equal — external influences may of course contribute to 
a greater frequency of a construction already marginally present in a language, 
or individual constructions may be calqued from another language. This is in line 
with the traditions of historical linguistics: (systematic) language change is inter- 
nally motivated by default, and external explanations are only needed when all 
else fails. Honti is also certainly right in criticizing outsiders for jumping to con- 
clusions: in typology and areal linguistics, unfounded or misinformed statements 
about Finno-Ugric languages are not rare. 

Part of the problem, however, seems to be that we have very little knowledge 
of what really happens in language contact and very little means of predicting 
the outcome, as it also depends on conscious actions and choices of language users. 
For these, the identification of similarities between contacting languages, “dock- 
ing” (Laakso 2001b), may play a crucial role: speakers choose words, elements, 
and structures that match both languages. Considering this, exploring multiple 
causation and interaction of various factors might be a promising avenue for 
further research. 


3.4 Modernization, globalization, and the changing 
character of language contact 


In the early twentieth century, Finno-Ugric speakers in the three nation-states were 
largely monolingual, and among many Finno-Ugric minorities, at least a few older 
speakers could be found whose command of the majority language was very weak. 
(True, there are great differences between Finno-Ugric minorities in this respect; 
in some Mari speaker communities, for instance, bilingualism in some Turkic 
language has been common for a long time already, and North Saami speakers 
a hundred years ago often had a good command of Norwegian, Swedish, and/or 
Finnish.) Traditional Finno-Ugric studies could thus operate with idealized 
monolingual speaker communities, and field linguists chose their informants so 
as to represent as “pure” a language variety as possible. 

Today, the situation has changed radically. In the Finno-Ugric nation-states, the 
school system now aims at providing everybody with a practical command of at 
least one “world” language, and according to the Eurobarometer survey of 2005 (see note 2), 
87 percent of the population of Estonia, 66 percent of the population of Finland, 
and 29 percent of the population of Hungary can speak at least one language other 
than their mother tongue at the level of being able to have a conversation. In Finland, 
younger generations are generally expected to know English, and the dominance 
of English in business, science, and entertainment is already strong enough to make 
experts concerned about English interference in young people’s written Finnish 
or the status of the Finnish language in science and professional communication. 
Whether this kind of language contact, involving language use in an unprecedented 
diversity of new modalities, styles, and genres, is different from language con- 
tact situations traditionally researched in contact linguistics (in connection with 
minorities or migrant groups) remains to be investigated. 

For many Finno-Ugric minorities, the school system (at least at higher levels) 
only exists in the majority language, the majority language dominates in most 
domains of language use outside home and family, and practically all speakers 
of today are bilingual. For these speakers, the grammar of the majority language 
is psychologically real and operates, for instance, within the abundant code-switches. 
Gender assignment in Russian words (Finno-Ugric languages have no grammat- 
ical genders) or the inflection of Russian numerals (in colloquial speech, years and 
dates, for instance, are often inserted in Russian) are mastered by modern Finno- 
Ugrians of Russia without difficulty. For quite a few younger speakers, even if 
they claim to be bilingual and identify themselves with the heritage language speaker 
community, the majority language might well be their “first,” i.e. primary or “matrix” 
language. 

The endangerment and obsolescence of Finno-Ugric minority languages have 
been subject to diverse sociolinguistically oriented studies, the most famous 
example probably being Susan Gal’s (1979) investigation on language shift among 
the Hungarian minority in Burgenland, Austria. These studies typically concen-
trate on the social conditions of language use, language choices, and language 
shift or on attitudes of speakers toward the maintenance or revitalization of 
minority languages (cf. Huss 1999). There is much less research on what the Finno- 
Ugric minority languages in their present-day condition really are like. Although 
it is generally acknowledged that today’s Finno-Ugric minority languages often 
clearly differ from the “classical” varieties, i.e. texts recorded from old and 
conservative informants in the most fruitful period of Finno-Ugric linguistic 
fieldwork before World War I, systematic comparisons between modern and “pure” 
language varieties seem to be lacking. 

In any case, what is known about the current state of Finno-Ugric minority 
languages indicates a wide spectrum of multilingualism, different degrees of 
language skills, and a diversity of attitudes influencing the choice of language. 
Among the Karelian informants of Sarhimaa (1999), there were terminal speak- 
ers with a restricted command of Karelian but also speakers who still could speak 
fluent “Traditional Karelian” with little or no code-switching into Russian; at the 
same time, they had mixed varieties or codes at their disposal, employing Russian 
(Sarhimaa calls this code “Karussian”) or even Finnish elements. For modern bilin-
gual speakers, alternating between codes can be a conscious “act of identity.” 
Conversely, modern speakers may choose not to mix codes. In her recent study 
of the language of two Hungarian-speaking families in Burgenland, David (2008) 
notes that her informants, despite some signs of insecurity in their command 
of Hungarian and despite abundant calques and syntactic interference from 
German, hardly ever switched into German in the interview situation. 

During the 30 years since Gal’s study, there have been changes in the status 
of Hungarian in Burgenland, including the introduction of Hungarian into the 
curricula of some primary and secondary schools. Standardization together with 
the increasing use of minority languages in education and media adds a new, 
important dimension to the traditional research of Finno-Ugric language contacts. 
Expatriate varieties of Finnish and Hungarian must choose between creating a 
standard of their own (like the old autochthonous Finnish minority in north Sweden, 
now developing their own Meänkieli) and using the homeland standard (as the 
Hungarian varieties spoken in the neighboring countries of Hungary, or the numer- 
ous post-World War II Finnish emigrants in Sweden do). 

In Hungarian-speaking areas especially, there is an increasing tension between 
the puristic tradition of language planning in the homeland and the reality of 
multilingualism in expatriate speaker communities. In the 1990s, this tension brought 
about a debate on “linguistic treason vs. rescue of language” (Kontra & Saly 1998; 
summarized e.g. by Maraz 2006). To support a more pluricentric idea of the 
Hungarian language, a “de-trianonization” project has been launched: collecting 
words used in expatriate Hungarian varieties and including them in new diction- 
aries of Standard Hungarian. Another development increasing pluralism con- 
cerns Finnish and Estonian: As the large immigrant Russian minority in Estonia 
and the relatively small but very rapidly growing immigrant communities in Finland 
must be integrated, there are probably more people now than ever before learn- 
ing Finnish or Estonian as a foreign language. This could mean that Finnish and 
Estonian are gradually losing their character as ethnic in-group languages and 
that native speakers will have to develop a greater tolerance towards their language 
as used by nonnatives. 


NOTES 


1 For more precise speaker statistics, see www.suri.ee/uralic.html. I have deliberately 
avoided presenting precise numbers, as there are problems in the reliability and inter- 
pretation of the available data. 

2 http://ec.europa.eu/public_opinion/archives/ebs/ebs_237.en.pdf. The high percentage 
in Estonia is partly explained by the fact that 62% of Estonians know Russian, due 
to the compulsory Russian teaching in the Soviet period. The second most popular 
foreign language in Estonia (41%), the most popular in Finland (60%), and the most 
popular beside German (16%) in Hungary is English. 


30 Language Contact in 
the Balkans 


BRIAN D. JOSEPH 


1 The Languages and Their Convergent Character: 
Introducing the “Sprachbund” 


Southeastern Europe is the home of an intense contact zone that takes in a number 
of distantly related languages (and then some - see below). The Balkans, as this 
area is generally known, is the most thoroughly studied contact area in the 
world, and thus occupies a special place in contact linguistics as one of the most 
important regions for understanding the mechanisms and results of language 
contact. A rough and mountainous region that forms a peninsula bounded by 
the Adriatic Sea on the west and the Black and Aegean Seas on the east and south, 
respectively, the Balkans have constituted a crossroads for speakers of many dif- 
ferent languages since at least the second millennium BCE. 

The interrelations among speakers in the Balkans in ancient times are of con- 
siderable interest since clearly various sorts of cross-language transfer showing 
the effects of language shift (substrata) and borrowing must have occurred. These 
effects are especially evident in the lexicon; for example, there appears to be 
a layer of Indo-European but non-Greek words in ancient Greek (e.g. aleipho: 
‘rub, anoint’ where the #a- and the -ph- are unexpected and the -lei- derives from 
Indo-European *lip-, seen in genuine Greek forms such as lip-os ‘fat’). Other 
sorts of effects involving various ancient languages could be (and have been) 
imagined as well. The languages in question include the following: Continental Celtic 
(in some form), Dacian (Daco-Mysian), Gothic, Greek, Illyrian, Latin, Macedonian, 
Phrygian, Pelasgian (“Pre-Greek”), and Thracian. But these prehistoric contacts 
present a number of challenges to language historians, as some of these languages, 
quite frustratingly for scholars, are only very sparsely attested or known just 
from brief mention in ancient sources. And some may not even be identifiable 
as individual languages. Thus much about their interactions must remain 
speculative. 

For all the intrinsic allure of the study of the ancient prehistoric situation and 
the speculations about contact that these languages offer, it is the more modern 
situation, dating from about 1000 CE, that has attracted the most attention among 
researchers in contact linguistics. 

The range of languages relevant in the Balkans in this more recent period extends 
over a number of branches of Indo-European and even beyond that family as well. 
They include the following (excluding languages, such as Tagalog or Arabic in 
Greece, that have entered the Balkans quite recently as the result of employment- or 
war-related modern immigration trends), listed with some explanatory annotations 
where deemed appropriate: 


(1) Albanian (both major dialects: Geg (north) and Tosk (south)) 

Armenian (spoken in Bulgaria) 

Circassian (Adygey variety; spoken in Kosovo area of (former) Yugoslavia) 

Bulgarian 

German (spoken in Romania) 

Greek (including the very divergent dialects like Tsakonian and Pontic (the 
latter only in the Balkans proper via the post-war population exchanges 
of the 1920s)) 

Hungarian (spoken in Romania) 

Italian (spoken in Istria area of (former) Yugoslavia) 

Judezmo (also known as Ladino or Judeo-Spanish) 

Macedonian (the Slavic language, thus different from Ancient Macedonian, 
given above) 

Romanian (more accurately listed as four separate languages: Aromanian 
(Vlach), Megleno-Romanian, Daco-Romanian, and Istro-Romanian) 

Romani (the Balkan variety of the Indic language of the Gypsies) 

Ruthenian (also known as Rusyn, spoken in Vojvodina area of (former) 
Yugoslavia) 

“Serbo-Croatian” (now, after the break-up of Yugoslavia, perhaps to be con- 
sidered as three separate languages: Bosnian, Croatian, Serbian (with a 
Montenegrin possibly developing as well)) 

Slovak (in a small enclave in Vojvodina area of (former) Yugoslavia) 

Slovenian 

Turkish 


Depending in part on just how the Balkans are defined geographically, especially 
as to the northern border, these languages may be called “languages of the Balkans,” 
a purely geographic designation. 

A further distinction, important for contact linguistics, needs to be made here. 
In particular, one needs to recognize further a class of “Balkan languages,” referring 
to those languages of the Balkans that show considerable structural and lexical 
convergence due to centuries of intense, intimate, and sustained contact involving 
multilaterally bilingual speakers. The effect is what one of the first commentators 
on Balkan convergence, Kopitar (1829: 86), described as an area in which “nur eine 
Sprachform herrscht, aber mit dreyerley Sprachmaterie” (‘only one language-form 
dominates but with threefold language-material’). Under this more restricted 
designation, the following languages can be included as “Balkan languages,” listed 
again with appropriate annotation as needed for clarification concerning the 
extent of convergence shown: 


(2) Albanian 
Aromanian 
Bulgarian 
Daco-Romanian 
Greek (most dialects, including Tsakonian (but excluding Asia Minor dialects)) 
Judezmo (maybe only at the phonological and lexical levels) 
Macedonian 
Megleno-Romanian⁶ 
Romani 
Serbian (with Torlak dialects of southeast Serbian being most relevant, 
much less so both the Croatian and Bosnian standards) 
Turkish (not a “full” structural participant but crucial nonetheless). 


These Balkan languages represent several different genetic affiliations: Albanian 
is its own branch within Indo-European, as is Greek; Bulgarian, Macedonian, 
and (Bosnian-Croatian-)Serbian are all South Slavic languages, again within 
Indo-European; the Romanian group and Judezmo belong to the Romance 
languages (of the Italic branch of Indo-European); Romani belongs to the Indic 
branch of the Indo-Iranian subgroup of Indo-European; and Turkish is part of 
the Turkic language grouping, generally believed to be part of a larger Altaic 
family. It is convenient to refer to these languages more generally as Balkan 
Albanian, Balkan Greek, Balkan Romance, Balkan Romani, and Balkan Turkish, 
to distinguish the Balkan varieties of these languages or language groups 
from their relatives outside the Balkans, inasmuch as the non-Balkan varieties 
generally do not show the structural and lexical properties that their Balkan 
relatives do. 

These languages offer some diversity to be sure, but more significantly, also 
cross-language similarities that have led to the recognition of a key construct for 
language contact studies — the “sprachbund” (French union linguistique, Russian 
jazykovoj sojuz, English linguistic league or linguistic area or even just sprachbund) — 
that has come to be applied to geographically based convergence zones, including 
South Asia (Emeneau 1956; Masica 1976), Meso-America (Campbell, Kaufman, and 
Smith-Stark 1986), and the Pacific Northwest of the United States and Canada (Beck 
2000), to name just a few. A sprachbund can be defined as any group of languages 
that due to intense and sustained bilingual contact share linguistic features, 
largely structural in nature but possibly lexical as well, that are not the result of 
shared inheritance from a common ancestor nor a matter of independent innovation 
in each of the languages involved.⁷ 

Taking note of, and ruling out, common inheritance is an important part of 
recognizing a sprachbund, inasmuch as this concept, when first formulated by 
Trubetzkoy in 1923 (with his 1928 pronouncement at the First International 
Congress of Linguists being the better known and more widely cited source), was 
explicitly contrasted with a Sprachfamilie (‘language family’), both being types of 
Sprachgruppen (‘language group(ing)s’). In a language family, the languages are 
genetically related (in the technical linguistic sense of deriving by a direct lineal 
descent from a common source). Trubetzkoy’s new concept, by contrast, was for 
languages that are geographically related, being located in the same region and 
often coexisting side by side in the same territory, but are not genetically related, 
and which yet, due to prolonged contact, show resemblances in form and struc- 
ture. For Trubetzkoy, the Balkans were the prime example of this new type of 
grouping. Others before him had made similar observations about the Balkans, 
though without generalizing to a new construct, let alone offering a handy label 
for it, or contrasting it clearly with genetic groupings of languages: Kopitar 1829, 
as noted above, drew attention to a single feature, the postposed definite article; 
Miklosich 1861 noted several convergent features involving Balkan languages, 
among which he included Modern Greek; and Sandfeld 1926, more widely 
known through the French translation of 1930, elaborated in a systematic way on 
a large number of such features in phonology, morphology, syntax, and lexicon, 
with the majority being lexical and phraseological in nature. 


2 The Convergent Features Themselves: 
Balkanisms 


It is appropriate at this point to flesh out the somewhat abstract references to 
structural and lexical convergences with some concrete details. While there are 
several convergent features — which may be called “Balkanisms” — that have attracted 
considerable attention in the rather large literature on Balkan linguistics in the 
period since Sandfeld, appearing for instance in most of the handbooks (Schaller 
1975; Banfi 1985; Feuillet 1986; Asenova 1989/2002; and Demiraj 1994/2004) but 
also in specialized studies (e.g. Joseph 1983; Friedman 2003), largely because they 
are widely realized in the Balkan languages, there are actually dozens of features 
that link small clusters of languages and dialects in the Balkans (see Friedman 
& Joseph, forthcoming for details and see also the discussion below). The following 
is a representative list of those that are widespread, and a sampling of those that 
are more localized, thus overall giving a feel for the most significant relevant 
features and types of features shared by various of these languages; they cover 
phonology (a-f), morphology (g-j), syntax (k-p), and lexicon (q-r): 


(3) a. the presence of a (stressed) mid-to-high central vowel; this feature is found 
in Albanian, Romanian, Bulgarian, some dialects of Macedonian and 
Bosnian-Croatian-Serbian, some Romani dialects, and Turkish; 

b. the presence of i-e-a-o-u in the vowel inventory without phonological 
contrasts in quantity, openness, or nasalization; this feature is found 
in Greek, Tosk Albanian, Romanian, Macedonian, Bulgarian, Torlak 
Serbian, and Romani; 
c. devoicing of word-final stops; this feature is found in Bulgarian, 
Macedonian, Megleno-Romanian, Modern Greek (dialectally only in one 
part of northern Greece, as reported in early twentieth century), some 
Romani dialects, South Montenegrin and Torlak Serbian, and Turkish 
(somewhat generally but with greater consistency in West Rumelian 
Turkish); 

d. development of nasal + voiced stop clusters (e.g. [mb]) out of nasal + 
voiceless stop combinations, so that the original voiceless clusters are rare or 
nonexistent, or present only in loanwords; this feature is found in 
Albanian, Aromanian, and Greek; 

e. presence of ð/θ (voiced/voiceless interdental spirants); this feature 
is found in Greek, Albanian, Aromanian, and (mostly in loanwords) 
dialectally in Macedonian; 

f. realization of /mj/ as [mnj]; this feature is found in Greek and 
dialectally in Arvanitika (Tosk Albanian dialects spoken in Greece); 

g. a reduction in the nominal case system, especially a falling together 
of genitive and dative cases; this feature is found in Greek, Albanian, 
Romanian, Bulgarian, and Macedonian (though note that the latter two 
have eliminated other case distinctions as well); 

h. the formation of a future tense based on a reduced, often invariant, form 
of the verb ‘want’; this feature is found in Greek, Tosk Albanian, 
Romanian, Macedonian, Bulgarian, Bosnian-Croatian-Serbian, and 
Romani; 

i. the use of an enclitic (postposed) definite article, typically occurring after 
the first word in the noun phrase; this feature is found in Albanian, 
Romanian, Macedonian, Bulgarian, and Torlak Serbian; 

j. analytic comparative adjective formations; this feature is found in 
Greek, Albanian, Romanian, Bulgarian, Macedonian, and Romani, as 
well as in Turkish; 

k. marking of personal direct objects with a preposition; this feature is 
found in Aromanian, Daco-Romanian, and Megleno-Romanian (via 
inheritance) and in southern Macedonian dialects; 

l. double determination in deixis, that is a demonstrative adjective 
co-occurring with a definite article and a noun (thus, ‘this-the-man’); this 
feature is found in Greek, southern Macedonian, and to a limited extent 
in Albanian too; 

m. possessive use of dative enclitic pronouns; this is found in South Slavic 
and in Greek; 

n. the use of verbal forms to distinguish actions on the basis of real or 
presumed information-source, commonly referred to as marking a 
witnessed/reported distinction but also including nuances of surprise 
(admirative) and doubt (dubitative); this feature is found in Albanian, 
Bulgarian, Macedonian, and Turkish, and to a lesser extent in Romani, 
Serbian, and Romanian (the presumptive); 
o. the reduction in use of a nonfinite verbal complement (generally called 
an “infinitive” in traditional grammar) and its replacement by fully 
finite complement clauses (see Joseph 1983); this feature is found most 
intensely in Greek, Macedonian, Bulgarian, Serbian (especially the 
Torlak dialects), and Romani, but also in Albanian (especially Tosk) 
and Romanian; 

p. the pleonastic use of weak object pronominal forms together with full 
noun phrase direct or indirect objects (“object doubling”); this feature 
is found in Greek, Albanian, Romanian, Bulgarian, and Macedonian, 
dialectally in Serbian, and to a limited extent in Romani; 

q. the formation of the “teen” numerals as DIGIT-‘on’-TEN; this is found 
in Albanian, South Slavic, Aromanian, Megleno-Romanian, and Daco- 
Romanian; 

r. lexical parallels, including shared phraseology (e.g. a phrase that is 
literally “without (an)other” meaning ‘without doubt’, or “eat wood” 
meaning ‘take a beating’), and numerous shared loanwords many of 
which are from Turkish. 


Some of these features are stated as synchronic typological characteristics, e.g. 
pleonastic use of weak object pronouns, while some are stated in historical terms, 
e.g. reduction of cases, while still others lend themselves to both sorts of framing, 
e.g. widespread use of finite complementation due to the replacement of infinitives. 
Both dimensions — the synchronic and the diachronic — are appropriate to consider 
in a discussion of the Balkan languages, since it is historical events (of contact 
and of reaction to that contact) that have led to the convergent typological state 
found in these languages. 

Without belaboring the point, it is important to note that these features are gener- 
ally taken to be significant indications of contact-induced convergence because, 
except as noted (e.g. with features involving the various forms of Romanian), 
they are not features inherited from a common protolanguage (e.g. Proto-Indo-European 
is typically reconstructed without a definite article and with synthetic 
(inflectional) rather than analytic adjective formations, so (i, j, l) clearly could 
not be inheritances, and any features involving Turkish and Indo-European languages 
similarly could not be due to genetic relatedness). Moreover, it is often the case 
that they are not found in varieties of the languages outside of the Balkans (e.g. 
other Romance languages use a preposed definite article and other Slavic languages 
generally lack an article, and other Romance languages have well developed infini- 
tival usage, as do other Slavic languages). Further, occasional occurrence of some 
of these features in other closely or distantly related languages (e.g. a postposed 
definite article in northern Russian dialects and in Scandinavian languages, or object 
doubling in Spanish, or a ‘want’-based future in English) does not vitiate the 
significant clustering of convergent features in the Balkans. To some extent, then, 
the geography here allows for what some (e.g. Campbell 1985; 1997: 330-1; 2006: 
14) have called a “circumstantialist” argument for a sprachbund. It is also the case 
that much is known about the history of these languages (and comparisons 
can be made with earlier stages of Greek, for instance, where infinitival usage 
abounded, or Slavic, with a well-developed case system), permitting a judgment 
as to the innovative convergence (and concomitant divergence from earlier struc- 
tural patterns) and thus allowing for what some (e.g. Campbell 1985; 1997: 330-1; 
2006: 14) have called a “historicist” argument for a sprachbund. 


3 Causes of Convergence in the Balkans 


A key issue in the study of the Balkans is trying to determine what the causes 
are for the convergences noted. For the most part, scholars agree that language 
contact is at work, though it must be admitted that some of the developments 
may well be independent in each language, at least for individual features, even 
if not for all of the similarities between and among the various languages. 
Joseph (forthcoming; see also Friedman & Joseph, forthcoming: ch. 5) argues that 
the stressed schwa is not a contact-induced feature but rather one that developed 
in each language on its own, and aspects of the emergence of the ‘want’-based 
future, especially the reductions to a highly “abbreviated” form, may well have 
taken place on a language-by-language basis, given that full and reduced variants 
coexisted (or continue to coexist) in each language for some time. 

But even if it is granted that contact is responsible, the question arises as to 
what kind of contact it was, and what contact-related mechanisms were at work 
in the formation of the Balkan sprachbund. One can imagine several possibilities: 


(4) a. substratum effect (i.e. first language speakers, shifting to a second language, 
carry over their habits and structures of the first language into 
the second, producing an altered form of the second language); 

b. adstratum effects (i.e. structures from a second language are imported 
by speakers into their native (first) language, e.g. for reasons of prestige); 

c. pidginization (i.e. a simplified version of a target language is developed 
by speakers of several different languages in a situation of communica- 
tive necessity); 

d. speaker-to-speaker accommodation to (imperfect) skills of an interlocutor 
(i.e. a native speaker of the target language adjusts his/her speech to match 
the perceived level of ability in that language by a nonnative speaker; 
this may involve selection by both speakers of structures for the target 
language that are “comfortable” to both, that is, acceptable as a variant 
in the target language and matching some structural element in the other 
language). 


Several comments about these putative causes are in order. First and foremost, 
the same feature often has been explained by different scholars in different ways. 
For example, a substratum explanation has sometimes been proposed for the 
loss of the infinitive, but the chronology needed to make that work, involving a 
prehistoric substrate language whose effects surface only in the medieval period, 
is difficult and argues against it; rather, either an adstratum account (so Sandfeld 
1930, with Greek as the prestige language) or pidginization effects (Rozencvejg 
1976), or, perhaps better, a mix of pidginization and accommodation drawing on 
language-internal tendencies can be posited (see Joseph 1983 for discussion of all 
of these possibilities). 

Second, these causes are not mutually exclusive, in that one might be right for 
one feature and another right for another feature. The postposed definite article, 
for instance, unlike the infinitive developments, could well be a real substratum 
effect, according to the rather compelling account offered by Hamp (1982). But 
other features are more amenable to other accounts, as the range of possibilities 
concerning the infinitive-replacement shows. 

What all these accounts have in common is that they involve, in one way or 
another, multilingualism. The question to be asked, then, is what sort of multi- 
lingualism is at issue: casual and sporadic or intense and regular, unidirectional 
or mutual, intimate, or just what? 

As the earlier discussion makes clear, it seems that for the Balkans, for the most 
part (and maybe for sprachbunds in general), it is intense, intimate, and mutual 
multilingualism that is decisive. Speakers of different languages, living side by side 
for centuries (which, for the Balkans, corresponds to what is known historically 
about the coexistence of several languages co-territorially in multilingual villages, 
towns, and cities), and needing to communicate with one another on a variety of 
levels, necessarily were familiar with one another’s languages to some degree, and 
accommodated in their usage of their own language to the often imperfect (but 
possibly quite good) knowledge of that language on the part of speakers of other 
languages that they interacted with. Speakers of the target language, it can be 
posited, selected for structures that had ready analogs in the other language, in 
effect streamlining their own usage in the direction of that of others. Speakers of 
the other language, for their part, would often have produced structures in the 
target language that showed the effects of interference (substratum influence) 
from their own native language. This mutual accommodation on a base of native 
language interference would naturally lead to the sort of convergence results that 
characterize a sprachbund. Note that Thomason and Kaufman (1988) posit just 
such a social context as essential for the development of a sprachbund, namely 
with the relevant speech communities each maintaining their own linguistic 
identity in spite of the extensive and intimate contact and thus with some members 
of the groups of necessity being bi- or multi-lingual. 

Although it might seem that the evidence of overwhelming convergence alone 
confirms the hypothesis of language contact being involved in the formation of 
the Balkan sprachbund, there is direct evidence of the sort of contact that breeds 
a sprachbund, namely the intense and intimate bilingualism referred to above. 
The evidence in question is certain types of lexical borrowings, which requires a 
bit of explanation. Borrowing of lexical material in and of itself can occur with 
only casual or even very little contact between speakers; for the latter situation, 
for instance, the case of learned borrowings through the medium of written texts 
can be cited. But in some instances, borrowing occurs under circumstances of close 
and sustained contact in such a way that it is clear that the speakers of different 
languages must have been communicating with one another on a regular and 
everyday basis, and under conditions where there was some knowledge of the 
others’ language involved, in short an intimate contact situation with a degree of 
bilingualism. 

The particular borrowings in question are examples of what Bloomfield (1933: 
461) has called “intimate” (i.e. non-need based) borrowing that “occurs when two 
languages are spoken in what is topographically and politically a single community 
... [and] extends to speech-forms that are not connected with cultural novelties.” 
They are especially revealing, since they necessarily involve real contact between 
and among speakers on an intense, regular, and sustained basis, in a more or less 
equal power situation.⁸ Moreover, the intimate borrowings to be considered here 
involve items that are tightly tied to discourse; clearly if such forms pass from 
language to language, there must have been discourse, i.e. conversational inter- 
actions, between speakers of these different languages. 

One large area of such discourse-related borrowings involves various sorts of 
negation. There is no need for borrowing here at all, since the languages of the 
Balkans — as indeed surely all languages in general — had means for expressing 
negation. The incorporation of elements of negation from other languages, there- 
fore, must represent the result of close contact among the speakers. Moreover, 
some of the forms in question have an expressive function that is intimately tied 
to conversational interaction and is not really found outside of that context, and 
one, moreover, is paralinguistic and thus could really only spread through visual 
contact.⁹ 

For example, Modern Greek and Macedonian have both borrowed the Turkish 
existential and emphatic negator yok ‘there is no... ; no!’ in its emphatic function. 
Thus Greek has [yok] (spelled <γιοκ>) and Macedonian has jok, both with the meaning 
‘no way; not in the least’.¹⁰ Interestingly, and significantly for the view advocated 
here, it is the more highly conversationally based function of Turkish yok 
that is borrowed, not the more denotational existential sense. Turkish, for its part, 
has an interjection ba with the meaning ‘oh!’,¹¹ which, according to Redhouse 
(1984), is a borrowing from Greek ba (spelled <μπα>) ‘ah well’ (but also, as a 
negator, ‘unh unh; no way’). And, the widespread Balkan gesture of an upward 
head nod to signal negation, found at least among speakers of Albanian, Greek, 
Romanian, and Turkish, may well reflect a diffusion from Greek, given what is 
known about Ancient Greek gestures and the fact that the distribution especially 
in Italy coincides with geographic limits of Magna Graecia (Morris et al. 1979); 
such an element of paralanguage could only spread through face-to-face interac- 
tion among speakers, that is, in an intimate contact situation. 

Other discourse-related borrowings include a large number of interjectional 
elements, presumably spread through face-to-face contact on a day-to-day basis. 
For instance, there is a form which can be glossed (roughly) as an “unceremoni- 
ous term of address” and stems ultimately from Greek (where there are some 55 
different variants across Greek dialects — see Joseph 1997), and which, in various 
of its shapes, is widespread in the Balkans, as indicated in (5): 


(5) Turkish: bre, bire, be 
Albanian: ore, or, mor, more, moj, ori, mori, moré, mre, voré, bre 
Romanian: bre, ma, mari 
Bulgarian: more, mori, bre 
Macedonian: more, mori, bre 
Serbian: more, mori, bre 
Greek: bre, vre, re, are, mare, mari, oré, voré, ori, mbre, pre, more (etc.; 
this last is the source of practically all these forms). 


Similarly, there are several parallel exhortative elements to be found across the 
Balkans, most probably from Turkish ay (interjection) + de (from de-mek ‘to say’), 
as given in (6): 


(6) Romanian: haide/(2PL) haideti/(1PL) haidem ‘c’mon; gw’an; let’s go’ 
Serbian: hajde/hajdemo (1PL)/hajdete (2PL) 
Albanian: hajde (SG)/hajdeni (PL) 
Greek: aide (spelled <άιντε>) 


Note in this regard also Bulgarian and Macedonian ela, both from Greek éla ‘c’mon’ 
(the imperative of ‘come’).¹² Continuing with interjections, one can also cite 
Albanian hopa and Greek opa! ‘oops’ (for something unexpected), ‘woo-hoo!’ 
(expression of joy); Albanian pa pa pa and Greek pa pa pa ‘alas!’ (for disgust); and 
Albanian aman and Greek aman ‘oh my!’ (from Turkish aman). 

Other highly expressive forms that are typically found in colloquial, and thus 
conversational, usage, also fit in here in that they too have diffused across the 
Balkans. In particular, one finds in these languages parallels in onomatopoeia (and 
the like). For example, for a dog’s noise, Albanian has ham-ham, Daco-Romanian has 
ham, Greek has yav yav, and Turkish has hav hav, and for the noise for attracting 
a cat, Greek has ps ps ps, as also in Bulgarian and Daco-Romanian. There is of course 
the risk of attributing to contact here what might be thought of as universal, but 
since onomatopes (etc.) do vary across even related and contiguous languages 
(Spanish has [waw] for the bark of a dog while Portuguese has [kaw]), the Balkan 
similarities, especially when viewed against the backdrop of other parallels in 
expressive forms and other grammatical and lexical convergences, fit into a 
pattern worthy of the attention of contact-minded linguists. In a similar vein, the 
expressive reduplication with m- that is found in Turkish and other more eastern 
languages,¹³ e.g. kitap-mitap ‘books and such’, occurs in Greek, e.g. dzandzala mandzala 
‘this and that’, literally ‘rags and such’ (cf. Levy 1980; Joseph 1984; 1995), Albanian, 
e.g. cingré mingré ‘trivia’, and Bulgarian (cf. Grannes 1978). 

Moreover, to elaborate somewhat on the parallel listed above in (3r), there are 
many calques — phraseological loan-translation parallels — in the Balkans in which 
native language material is substituted for elements in other-language combinations. 
While there can be learnedisms that are calqued with no direct speaker contact, 
such as German Mitleid, which is a prefix-plus-noun combination based on Latin 
compassio (itself a calque on Greek sumpatheia ‘compassion, sympathy’), the collo- 
quial and expressive nature of these Balkan calques suggests a social milieu for 
their creation that is different from that involving learnedisms. In particular, they 
point to face-to-face speaker interaction. Moreover, they offer direct evidence of 
bilingualism, since speakers must be familiar enough with the other language to 
be able to figure out equivalences in their own language to the other language’s 
pieces in the phrase or form or combination of elements being calqued. Thus, as 
noted in (3r), Greek has troγo ksilo for ‘I get a beating’, but it is literally ‘I-eat wood’, 
with the choice of verb agreeing with the Turkish use of yemek ‘to eat’ in the expression 
kötek yemek ‘to get a beating’ but literally ‘to-eat a-blow’. Similarly, what is 
literally ‘to take [someone’s] eye’ means ‘to dazzle’ in several languages and 
‘to cut [one’s] mind’ means ‘to decide’. 

Therefore, putting these two types of colloquial and expressive lexical evidence 
together, a strong case emerges for the conditions being present in the Balkans 
that were the essential ingredients for the convergence effects that characterize a 
sprachbund. They thus offer a further argument, along with the geography and 
the history of the Balkans, that the convergences listed in (3) are indeed indicative 
of a sprachbund. 


4 Assessing the Sprachbund: Localized versus 
Broadly Realized Convergence 


As suggested in the enumeration of Balkanisms given above in (3), there are 
two general types of contact-induced convergences to be recognized:¹⁴ those that 
occur on a widespread basis among the various languages and those that are highly 
local in nature. The loss of the infinitive and its replacement by finite forms would 
be an example of the former type, and the occurrence of prepositional marking 
of personal direct objects would be an example of the latter type. 

From the general approach taken here, with the emphasis on actual speaker- 
to-speaker contact as the source of the diffusion of features, it should be clear 
that all diffusion should be taken to be on a localized basis. And, as indicated 
above, there are actually dozens more such local convergences to be found in the 
Balkans. 

This observation leads to two questions: first, how one is to reconcile the 
local effects with the broadly realized Balkanisms, and second, whether, in 
the face of localized convergences, it makes any sense to think of the Balkan 
languages in the broad terms that the “sprachbund” designation requires. In 
other words, is a sprachbund a viable construct if all the relevant contact takes 
place locally? 

The answer is that local diffusion, if given enough time and the right sort of 
contact at the relevant “edges” of locales, can lead to spread across larger areas, 
a sort of “diffusionary chain reaction,” as it were. Thus a widely distributed feature, 
as with the infinitive, would have started small and spread widely from that 
point, and a feature with a more limited range is one that either has not had the 
chance yet to spread further or has been checked before spreading further. 
Indeed, since much remains to be known about the spread of any linguistic inno- 
vations — those that are the result entirely of language-internal factors as well 
as those that are contact-induced — it perhaps makes no sense to worry about 
features that remain localized in their distribution. 

What we really have then is clusters of convergent languages, where the 
convergence is on various features in various locales. A sprachbund — the Balkan 
sprachbund in this case — thus is really to be defined as a cluster of such clusters 
(see Hamp 1989 especially on this view).¹⁵ 

This view of the Balkans, and of sprachbunds in general, has several advan- 
tages and addresses a couple of issues typically raised as potential problems for 
the sprachbund as a viable construct. First, by looking at the overall picture on 
a feature-by-feature basis, the fact that some convergent features may not be as 
strongly realized as others is not a problem, since there is nothing that says that all 
features must be found to the same extent in all languages or even be found to 
any extent in all of them. Thus, the absence of a postposed definite article from 
Greek does not vitiate the importance of this feature in linking Albanian, Romanian, 
and Balkan Slavic. Second, as noted in section 3, different causes may underlie 
different Balkanisms; they are not uniform as to their source. And of the lists of 
Balkanisms that are generally offered, it is hard to see how all of them must involve 
contact. Third, the fact that there are differences among the languages even with 
respect to convergent features — for instance the relatively recent recrudescence 
of an infinitive in Tosk Albanian (the originally nominal purpose construction of 
the sort për të punuar ‘(in order) to work’ (literally ‘for (the-act-of-)working’)) in 
the face of the loss of such constructs in the other languages — is not a problem, 
since just as features spread on a localized basis, so too can they go off in their 
own direction on a localized basis. 

What this really means is that different features have different histories, but 
that is as it should be since not all of the features are tied to one another 
such that a change in one would necessarily trigger a change in another. 
Moreover, the history is crucial to what the languages are synchronically. 
In a sense then, in the Balkans, we are dealing with the aftermath of a period of 
intense contact leading to convergence; the modern standard languages, inasmuch 
as they were generally formed on the basis of contact-affected dialectal 
sources, show structural convergences as a relic of their histories; ongoing 
convergence continues, but on the local level rather than the “national language” 
level; the conditions that gave rise to the convergence, that created the 
sprachbund, are no longer present as far as the standard languages are concerned, 
though they do obtain in various multilingual locales still. Thus in the present 
just as in the past in the Balkans, the local dialects must be the main focus for 
the study of language contact, as they are, and have always been, where the 
action is. 
NOTES 


1 Eric Hamp has argued, for instance (see Hamp 1994), that “Illyrian” may be a cover 
term used in ancient times in much the same way that “aboriginal language” is used 
by many in Australia today or “Indian language” is used by many in the United States, 
in each case masking a considerably complex and diverse linguistic situation. 

2 The state of what is known about these languages is given a careful treatment in Katičić 
(1976), and Friedman and Joseph (forthcoming: ch. 1) provides a good summary 
overview. 

3 In the time since Montenegro voted in favor of independence in 2006, a new constitution 
has declared Montenegrin as the official language, but the usual trappings of 
a standard (and standardized) language, such as an official orthography, codified aspects 
of grammar and lexicon, and such, have yet to be developed. Hence I refer in what 
follows just to Bosnian-Croatian-Serbian, realizing that that triad may need eventually 
to be expanded. 

4 By some accounts, the Balkans may start at Vienna, for instance, while others look to 
the Danube River as the northern edge. 

5 If Tagalog, Arabic, and such languages (including English even) are to be counted, 
then perhaps one might speak of an even broader designation of “languages in the 
Balkans,” of which “languages of the Balkans” would be a subset. 

6 Despite distinguishing here among Aromanian, Daco-Romanian, and Megleno-Romanian, 
quite rightly, as separate languages, I nonetheless occasionally, for the sake 
of convenience, refer simply to “Romanian” as a cover term for all three. 

7 There is some controversy as to what the threshold is for recognizing a sprachbund. 
Thomason (2001: 99) opts for three as this lower limit on the grounds that it trivializes 
the notion to allow just two languages with structural features in common to determine 
what should properly be thought of as a special grouping, while Friedman and 
Joseph (forthcoming: ch. 3) argue that convergence is convergence and that therefore, 
assuming other criteria are met, a two-member sprachbund should not be ruled out 
in principle. 

8 The situation with Romani bilingualism admittedly does not involve equal power 
structures, since it was unidirectional: Romani speakers learned the other languages 
around them but speakers of those languages generally did not learn Romani. Still, 
assuming that what Romani speakers learned was already Balkanized varieties of these 
other languages, by a process of “reverse interference” (see Friedman & Joseph 2009: 
ch. 3), whereby speaking another language can have an effect on one’s native language, 
Romani speakers could have assimilated their Romani to aspects of these other 
languages they came to speak. 

See Joseph 2000; 2001; 2002a; 2002b for more on parallels in the Balkans involving 
negation. 

Turkish also has an emphatic negative, presumably related to yok, with the form yo. 
However, despite the similarity to the Albanian word for ‘no’, jo, this Turkish form 
is unlikely to be the source of the Albanian, since jo is found even in the Arbéresh 
Albanian of southern Italy, an Albanophone area that shows little or no influence from 
Turkish. I am indebted to Eric Hamp for clarification on this important point. 

It is of course difficult to give precise definitions for interjections; the glosses here (and 
below) are intended just to give a feel for the form’s use. 
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12 Albanian eja ‘come!’ probably has a different origin and is not connected to ela. 

13. And elsewhere - see Southern 2005 on this particular expressive mechanism 
cross-linguistically. 

14 This assumes, of course, that the non-contact-induced convergences, especially those 
due to separate and independent developments in each language (as is probably the 
case with stressed schwa, as noted above), are properly excluded from consideration. 

15 Just as a galaxy is made up of constellations and other groupings of stars and planets. 
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Development of Arabic 


KEES VERSTEEGH 


Around the year 970, the famous Arabic grammarian Ibn Jinni (d. 1002) asked 
his Persian teachers what they thought about the Persian language. They replied 
that Arabic was vastly superior to Persian, in logic and in beauty. At that time, 
the fourth century of the Islamic era, Arabic was the language that united the Islamic 
world, the language of the Qur’an, the language whose superiority was acknow- 
ledged by all Muslims. 

This attitude toward Arabic still persists among Muslims all over the world. 
In the West African country of Mali, for instance, Malian students of Arabic, 
even though they are not particularly fond of Arabs, are convinced that any Muslim 
should be able to understand and speak Arabic, because this is the language in 
which they will be questioned by the angels in paradise. Learning Arabic is there- 
fore regarded by them as a sign of devotion to Islam (see Bouwman 2005: 125ff.). 

Throughout history, native speakers of Arabic and Muslims have always 
shared this belief in the superiority of Arabic. Yet, Arabic has always coexisted 
with a large number of other languages. Indeed, the number of people for whom 
Arabic is a second language is much larger than that of its native speakers. 
These other languages may be useful for communication in daily life, or indeed 
for explaining the Arabic message of the Qur’an, but they can never have the 
religious status of Arabic. This status belongs to Classical Arabic exclusively: 
the spoken Arabic vernacular has as little prestige as any other language. Within 
the diglossia of the Arab world, the spoken language serves as the Low variety, 
whose existence is ignored by grammarians and sometimes denied outright by 
the speakers themselves (Ferguson 1959a). Outside the Arab world, most believers 
learn only a rudimentary form of Classical Arabic, which is nonetheless revered 
by all of them as the language of God’s revelation to the Muslims, even if they 
do not master it themselves. Spoken forms of Arabic do not share in this 
reverence, either in the Islamic world or in the language islands and the Arab 
diaspora. 

In the modern world, speakers of Arabic are confronted daily with the reality 
of globalization, in which English is the language of technical progress and 
wealth. They still believe in the superiority of the Arabic language, but have to 
accept that other languages have gained supremacy. In the former French colonies, 
the speakers’ attitude toward French has been shaped by the colonial period to 
the point where French is still associated with social success and technological 
progress. In Bentahila’s (1983: 31-5) study on language attitudes in Morocco, 
52 percent of the respondents declared that Classical Arabic was the most beauti- 
ful language (as against 22 percent French), whereas 70 percent regarded French 
as the most useful for study (as against only 18 percent Classical Arabic). 

This combination of what Ferguson (1959b) calls “language myths” and global 
realities has determined the relations between speakers of Arabic and those of 
other languages. In what follows, a distinction will be made between community 
language and superimposed language (Boumans 2007), either one of which may 
be socially dominant, i.e., the language to which speakers are most exposed in 
their daily life. 

The first three sections of this chapter deal with the way Arabic has been affected 
by contact with other languages: first, in those situations in which Arabic was the 
socially dominant and superimposed language; second, in those situations in which 
Arabic borrowed from other superimposed languages, which were socially non- 
dominant; and third, in those situations in which it was influenced by contact with 
a socially dominant and superimposed language. The next three sections survey 
the reverse situation, the influence of Arabic on other languages: first, the influence 
of Arabic as a community language on a superimposed language; then, the 
influence of Arabic on other languages as a socially dominant and superimposed 
language; and finally, the influence of Arabic on other languages as a socially 
nondominant superimposed language. The last section deals with the categories of 
borrowing, at the phonological, morphological, and syntactic levels. 


1 Substratal Influence in Arabic 


In the course of the Arab conquests of the seventh century CE, Arabic became 
the superimposed language in a large part of the Middle East and North Africa, 
replacing in this function Greek, Latin, and Persian. During this process, it came 
under the influence of the local community languages spoken in this area, such as 
Aramaic/Syriac, Persian, Coptic, South Arabian, and Berber. In the newly con- 
quered territories, new Arabic vernaculars emerged, whose structure differed con- 
siderably from the Arabic spoken by the Bedouin tribes in the Arabian peninsula 
before Islam. Although there is considerable controversy about the mechanisms 
of change that led to this development, most scholars would probably agree that 
the contact with the indigenous languages in the conquered territories was one 
of the formative factors in the emergence of the new vernaculars. In Thomason 
and Kaufman’s (1988) model of linguistic contacts, such a development is part of 
what they call “substratal influence,” i.e., changes introduced in a language that 
is learnt by second language learners, who eventually abandon their first language 
and shift entirely to the superimposed language. Because the native speakers of 
Arabic were numerically far inferior to the indigenous population, these struc- 
tural changes had a high chance of becoming part of the repertoire of all speakers, 
once the shift to the new language was completed. 

It is not always easy to decide which changes have been caused by substratal 
influence. According to Diem (1979), changes that are attested independently in 
several Arabic dialects, even in areas where the alleged substratal language was 
never spoken at all, cannot be the result of substratal influence. He rejects, for 
instance, the explanation of the loss of interdentals in Cairene Arabic by substratal 
influence, because this occurred in other dialects as well. If substratal influence 
is not accepted, common features in the new vernaculars, for instance their general 
tendency toward analyticity, simplification, and reduction, can only be explained 
as instances of linguistic drift, i.e. as universals of language change. 

In Ferguson’s (1959c) view of the development of Arabic, some of the more specific 
common features are to be explained by a monogenetic model of the origin of 
the vernaculars: he believes that features like the disappearance of the dual, the 
merger of the phonemes /ḍ/ and /ḏ̣/, and the occurrence of certain lexical items 
(e.g. šaf ‘to see’ instead of Classical Arabic ra’a), all derive from an Arabic koine 
spoken in the military garrison cities that were founded at the beginning of the 
conquests. This koine became the common ancestor of all the modern dialects. 
Somewhat along the same lines, Owens (2006) maintains that the common changes 
in the Arabic vernaculars can be used to reconstruct the ancestral language of the 
modern vernaculars. Unlike Ferguson, he does not regard these common features 
as the result of koineization, but as going back to pre-diaspora Arabic, a form of 
Arabic that existed along with, but was different from the Classical Arabic of the 
Qur’an. 

Yet, it is hard to imagine that the language shift from the indigenous languages 
to Arabic could have taken place without any structural changes in the language. 
In other contact situations, phenomena like the loss of declensional endings, reduc- 
tion in the morphological structure, and changes in word order and in the agree- 
ment system have been ascribed to substratal influence, and it seems reasonable 
to assume that their occurrence in the Arabic dialects is somehow connected to 
the language shift process. Claims of substratal influence in the Arabic vernaculars 
vis-à-vis Classical Arabic usually focus on Berber, Coptic, South Arabian, and 
Syriac. Since Berber is still spoken by a sizable proportion of the population of 
the Maghreb, its influence is both substratal and adstratal. Commonly acknow- 
ledged as due to Berber influence are the affrication of /t/ and the use of certain 
nominal patterns. Coptic became extinct in the sixteenth century at the latest, but 
may have been responsible for some features of Egyptian Arabic, for instance, 
the in situ position of the interrogative and the construction of the comparative 
with the preposition ‘an ‘from’ (see Behnstedt 2006). South Arabian languages are 
still spoken by small minorities in Yemen and Oman; some features in Yemeni 
Arabic dialects, such as the plural pattern kitab/kutawwib ‘book/books’ (as 
against Classical Arabic kitab/kutub) and the perfect with a k affix katabk ‘I wrote’ 
(as against Classical Arabic katabtu) are often ascribed to the influence of these 
languages. Neo-Aramaic/Syriac is still spoken by minority groups in Turkey, Syria, 
Iran, and Iraq. It goes back to Aramaic as the old cultural language and lingua 
franca of the Hellenized world in the last centuries before and the first centuries 
of the common era. One of the features commonly ascribed to Aramaic/Syriac 
substratal influence is the deletion of the short vowel /a/ in open unstressed 
syllables, which is found in North Lebanese dialects in the same region where 
Neo-Aramaic dialects are spoken (for language contacts between Arabic and 
Aramaic/Syriac in Syria see Arnold & Behnstedt 1993). 

Lexical influence from the substratal languages is, as expected, rather small. 
According to Thomason and Kaufman (1988: 39), in a situation of language shift, 
where substratal influence operates, lexical items hardly ever make it into the super- 
imposed language. The reason behind this is that it makes little sense for people 
shifting to another language to use lexical items from their own language, which 
are not understood by the speakers of the superimposed language anyway. Items 
related to local flora and fauna and popular culture form an exception here. Thus, 
one finds in Egyptian Arabic words like timsah ‘crocodile’ (< Coptic ti-msah with 
Coptic feminine article), in Moroccan Arabic azeffan ‘lobster’ (< Berber azeffan), and 
in Lebanese Arabic massan ‘extension of plough handle’ (< Neo-Aramaic of 
Ma‘lūla mason). 

An extreme case of substratal influence through language shift is found when 
a language is pidginized and subsequently creolized. These processes affected Arabic 
predominantly in Africa. In nineteenth-century Sudan, recruits speaking differ- 
ent African languages (Nuer, Dinka, Shilluk, Nubian) were brought together in 
military camps built by the Anglo-Egyptian army in southern Egypt. Their com- 
mon means of communication was an Arabic pidgin, called Juba Arabic. After 
the Mahdist revolt, many of these soldiers, most of them Nubians, migrated 
to Uganda and Kenya, where the language became creolized under the name 
Ki-Nubi (see Wellens 2005). Elsewhere in sub-Saharan Africa, the use of Arabic 
as a lingua franca in trading relations between people speaking different languages 
led to the emergence of trade jargons and pidgins, for instance Bongor Arabic in 
Chad (see Luffin 2008). There are some indications that such contact languages 
have been in existence for a long time (see Thomason & Elgibali 1986). 


2 Borrowing from Other Prestigious Languages 


Even before Islam, speakers of Arabic had been in contact with speakers of other 
languages at the periphery of the Arabian Peninsula. These languages acted as 
superimposed languages of culture, without being socially dominant. Old loan- 
words from Syriac (e.g. salib < slab ‘to crucify’), Latin (e.g. dinar ‘gold coin’ < denarius, 
sirat ‘path’ < strata), and Greek (e.g. fulk ‘ship’ < ephólkion) are attested in the 
earliest preserved texts, the pre-Islamic poems and the Qur’an. According to some 
scholars, such central Islamic notions as salat ‘prayer’ (< Syriac slota), and even 
the word Qur’an itself (< Syriac qeryana) derive from the Syriac Christian tradition 
in the Near East. In the Islamic tradition, a controversy soon arose about 
the possibility of such loanwords in the language of the Qur’an. Early Muslim 
exegetes saw no problem in acknowledging the foreign origin of some of these 
terms, but generally speaking, later philologists and theologians did not accept 
these etymological sources. 

Once Classical Arabic was established as the new administrative, cultural, and 
religious language of the Islamic empire, its status as a superimposed language 
in this area did not make it impervious to the influence of other cultures. From 
an early period onward, languages like Persian and Greek provided Arabic with 
a large number of loanwords. These were predominantly transmitted through the 
medium of translation, especially in the case of Greek, from which hundreds of 
scientific, philosophical, and medical texts were translated in the course of the 
ninth century. Persian loanwords are often from the domains of administration, 
plants and herbs, and architecture (see Asbaghi 1988). Many of them were inte- 
grated in the root-and-pattern system of Arabic, e.g. ‘ustad, plural ‘asatida ‘teacher, 
master’ (< Persian ostad) or barnamaj, plural baramij ‘program’ (< Middle Persian 
barnamak). Greek loanwords or loan translations are very often philosophical or 
medical in nature. Some of these loanwords were treated in the same way as Persian 
loanwords, e.g. faylasūf (< Greek philósophos), which became the point of departure 
for further derivations (plural falasifa, verb falsafa ‘to philosophize’). But the 
majority of Greek words were translated in the form of calques, whether in logic 
(mawdū‘ ‘subject’ from the verb wada‘a ‘to place’, on the basis of Greek hupokeímenon), 
or in medicine (šabakiyya ‘retina’ from šabaka ‘net’, on the basis of Greek 
amphiblestroeidés chiton ‘net-like cloak; retina’). 

In Modern Arabic, both integration and loan translation are still found in 
borrowing. In Modern Standard Arabic, partly due to the purism of the Arabic 
Language Academies, foreign words are often replaced with neologisms, which 
in their turn are more often than not loan translations of the foreign example. 
This already applied to those words which were borrowed from Turkish at the 
time when the Arab countries were provinces of the Ottoman empire (e.g. jamraka 
‘to take toll’ < Turkish gümrük). It also applies to the many loanwords from English 
and French that entered the language in the nineteenth century, especially in 
the domain of political terminology, e.g., qawmiyya ‘nationalism’ (from qawm 
‘people’), ištirakiyya ‘socialism’ (from ištirak ‘companionship’). Because of the 
nature of French colonial policy, interference from French was particularly strong 
in the Arabic spoken in the Maghreb. In the standard language, French loanwords 
were never very frequent (most intellectuals preferring to write in French, any- 
way), but there is a high degree of stylistic interference, for instance in an expres- 
sion like wada‘a fi l-isti‘mal ‘to put to use’ (derived from French mettre en usage) 
or the use of huqūq ‘rights’ in the sense of ‘fees’ (like French droits). Arabic dialects, 
on the other hand, often integrated loanwords from the colonial languages into 
their morphological structure, in particular the dialects of the Maghreb, where 
French and Arabic coexisted for a long time in a context of code-mixing (see 
Heath 1989). 

In the globalized world of the twentieth century, especially in the language of 
the (electronic) media, contact with English has led to the introduction of a large 
number of loanwords in the standard language. Yet, even in such a conspicuously 
modern domain as computer terminology, Arabic terms are gradually replacing 
the foreign loans. The word hasib ‘computer’, for instance, has become the official 
word for what used to be called kumbyūtar. Stylistic influence is manifest, however, 
especially in the language of the media, where expressions like qatala l-waqt ‘to 
kill time’ or la‘iba dawran ‘to play a role’ are current. An interesting development is 
the formation of prefixes like qab ‘pre-’ (< qabla ‘before’) or šibh- ‘pseudo-’ for the 
translation of scientific words in standard Arabic, e.g., qab-tarixī ‘prehistoric’, 
šibh-jazira ‘peninsula’; the use of prefixes is almost entirely foreign to the morphological 
structure of Arabic. A number of syntactic features may be the result of interference 
from English and/or French, e.g. the conjunction ma’ida, which may have 
originated as a device to translate the English whether, or the idea of reciprocity 
expressed by ba‘duhum al-ba‘d to translate English each other. Word order, too, may 
have been affected by the contact with English in the media, for instance in the 
use of sentences starting with an indefinite subject, especially in headlines. 


3 Arabic in the Diaspora 


Outside the Arab world, Arabic is spoken as a minority language in the so-called 
language islands in Uzbekistan, Afghanistan, Anatolia, Khuzestan, and Cyprus, where 
other languages (Uzbek, Pashto, Turkish, Farsi, and Greek, respectively) function 
as both the official language and the socially dominant and superimposed language. 
In these communities, virtually all speakers are at least bilingual in their Arabic 
vernacular and in the official language of the country they live in. The interference 
of the official language differs from country to country and is probably heaviest 
in Uzbekistan and Cyprus. In Uzbekistan Arabic, even the word order has changed 
from VSO to SOV under the influence of Uzbek (see Versteegh 1984-6). In Cyprus, 
code-mixing between Cypriot Arabic and (Cypriot) Greek has reached a stage 
where speakers have started to introduce inflected Greek verbs in their Arabic 
(see Borg 1985). 

Malta is a special case. After the initial conquest of the island by speakers of 
Arabic from North Africa in 870, the island became Arabic-speaking, possibly after 
having been repopulated from Sicily. Successive domination by the Italians and the 
English has led to a complete restructuring of the language. Nowadays, Maltese 
is the official language of Malta and the only Arabic vernacular to have become 
a standard language, written in Latin script. Through the contact with Italian and 
Sicilian, the lexicon of Maltese is replete with Romance items, and the influx of 
Italian words has even led to a partial abandonment of the nonconcatenative 
morphology (see Mifsud 1995), so that the original triradical structure of the 
lexicon has become obscured. Broken plural patterns are still applied to some 
of these loanwords, but in the form of reduplication at the end of the stem, e.g., 
umbrella, plural umbrelel; gverra ‘war’, plural gverer. Italian verbs have been 
integrated in such numbers that the originally weak conjugation of Arabic now 
has become the predominant form of the verb, e.g. salva, imperfect jsalva ‘to save’ 
(< Italian salvare), solva, imperfect jsolvi ‘to solve’ (< Italian solvere). 

Most of the language islands were established at an early date, but the real Arab 
diaspora dates from a later period and was directed elsewhere, when large numbers 
of speakers of Arabic emigrated to the Americas and to Europe. Immigration to 
the Americas (Rouchdy 1992) started as early as the nineteenth century, especially 
from the Levant, and has created large communities of Arabic speakers in the 
Americas, for instance, that of the Lebanese in Brazil. These immigrants were mainly 
middle-class speakers who soon became bilingual in Arabic and the language of 
their new country (Spanish/Portuguese or English). Although the new language 
has become the socially dominant language for the new generations, there is a 
high degree of language maintenance and most speakers are able to freely mix 
the languages. As a result, many loanwords have been integrated in the community 
language of these speakers and even verbs are borrowed (see Nabhan 1989), e.g. 
nawmar, imperfect bi-nawmir ‘to go on a date’ (< Portuguese namorar). 

In Western Europe, speakers of Arabic from the Maghreb started to arrive from 
the 1970s onwards (Boumans 1998). These immigrants were mostly unskilled work- 
ers; originally, they meant to return to their homeland after having worked for 
a limited period of time, and they were not particularly motivated to learn the 
language. In the course of time, it turned out that most of them were there to 
stay. As a rule, new generations did not preserve the language of their parents, 
and their heavy code-mixing Arabic-French, Arabic-German, or Arabic-Dutch is 
often the last stage before a complete shift. 


4 Arabic Substratal Influence on Superimposed 
Languages 


Those migrant speakers of Arabic who shift toward the dominant language of 
their new country may develop a special form of the language they have shifted to, 
which may become their new community language. In extreme circumstances, this 
substratal influence may lead to the emergence of pidginized or even creolized 
forms of these languages, but in the concrete case of immigration to Europe, even 
though there is a certain amount of isolation between the immigrant population 
in the suburbs and the population of the host country, the situation is not likely 
to lead to the emergence of real ethnic varieties (ethnolects) of French, Dutch, 
English, or German. There are, however, indications that in some countries, the 
end product of the process of language shift will be a sociolect that is vaguely 
associated with the immigrant communities, or with parts of them. In the 
Netherlands, Moroccan Dutch has a certain notoriety as a street language, which 
has even become popular with new speakers of Dutch from other communities, 
such as Surinamese or Turkish youngsters. The special character of this street 
language seems to be confined to a characteristic accent and to lexical items. 


5 Arabic and Minority Languages 


Borrowing from Arabic as the socially dominant superimposed language took place 
on a large scale in all the indigenous languages in the newly conquered territories. 
After the conquests in the seventh century, not all speakers of indigenous languages 
shifted to Arabic. Even today, varieties of Aramaic are still spoken by more than 
300,000 people in Iran, Turkey, Lebanon, Syria, and Iraq. According to some 
estimates, at least 40 percent of the population of Morocco is Berber-speaking. 
Most of these speakers are bilingual in their community language and Arabic. As 
a result, their home language is under constant pressure from the official language. 

The official attitude toward the minority languages does not tend to be very 
tolerant. Even apart from the prestige of Arabic as the official language and 
the language of the Qur’an, most Arab countries carried out a policy of Arabization 
after gaining independence from colonial domination, and stipulated in their constitutions 
that their countries were unilingual. Some minority languages have barely 
managed to survive so far, for instance the Modern South Arabian languages (Mehri, 
Harsusi, Soqotri, etc.) in Yemen and Oman, which are spoken by some 200,000 
people, but seem to be losing ground. 

Berber or Tamazight is a special case, however. In the last few decades a change 
in attitude toward Berber on the part of the authorities has become noticeable. 
At present, in some Maghreb countries, Berber may be used in the media and 
has even been allowed to become part of the school curriculum. Nonetheless, there 
is a tendency toward language shift in the younger generations because of the 
status of Arabic as the official language (see El Kirat 2007), and even for those 
speakers who manage to maintain their home language, there is a large degree 
of interference from Arabic, both structurally and in the lexicon. 

As a related Afro-Asiatic language, Berber shares some features with Arabic, 
especially in its root structure, which makes it easier to integrate loanwords from 
Arabic into its structure, e.g. ahbib ‘friend’, plural ibiban (< Moroccan Arabic hbib), 
with a masculine prefix, and tansalmt, plural tinsalmin ‘Muslim [fem.]’ (< Moroccan 
Arabic msalma) with a feminine prefix. This applies even more to related Semitic 
languages, like Neo-Aramaic and the South Arabian languages, which have the 
same nonconcatenative structure as Arabic. Loanwords, even loan verbs, can be 
integrated fairly easily in these languages. Thus, in Neo-Aramaic one finds Arabic 
verbs in derived verbal patterns that correspond to Arabic patterns, e.g. in’fzar 
‘to explode’ (< Arabic infajar, pattern VII); scakbel ‘to accept’ (< Arabic istaqbal, 
pattern X) (Arnold 2007), and in South Arabian Harsusi Arabic verbal patterns 
have been introduced as variants of the indigenous patterns (see Lonnet 2008), 
so that gdtma, egtoma ‘to gather’ are used along with the Arabic loan verb with 
the same meaning agtamd’ (< Arabic ijtama’). 

Throughout history, other languages, now defunct, must have gone through a 
similar process before they died out as community languages, i.e. they were 
affected structurally by Arabic. The last speakers of Coptic as a community 
language probably died in the sixteenth century, and in the period between the 
conquest and the final extinction of the language many Arabic loanwords became 
current in Coptic treatises, and presumably in the spoken language as well. 

In the Iberian peninsula, Arabic and Romance coexisted in the period between 
711 and 1492. It is not entirely clear to what extent both the Muslim and the Christian 
population were bilingual, but the popularity of bilingual poetry (the so-called 
jarchas) demonstrates that at least some layers of society were conversant in both 
languages (Zwartjes 1997). The mutual influence is manifest through the large 
number of Arabic loanwords in Spanish, Catalan, and Portuguese, and inversely, 
in the Romance loanwords in Andalusian Arabic (Corriente 1994). 

In Persia, the linguistic situation developed in an unexpected way. After the 
initial conquest in the seventh century, Middle Persian (Pahlavi) must have gone 
through the same process of gradual language shift as was the case in the other 
conquered territories, although a sizable part of the population in provinces like 
Khorasan, and in new garrison cities like Basra and Kufa, remained bilingual. 
In the ninth and tenth centuries, Dari, a peripheral dialect of Middle Persian, 
was re-introduced as the new language of court and state and it became the new 
community language of Persia under the name Farsi. Henceforth, Arabic-Persian 
bilingualism was restricted to scholars, and Arabic became a learned language 
that ordinary people only heard in the mosque during Qur’an recitations. Arabic 
remained an important superimposed language, but it was no longer a language 
ordinary people were exposed to in their daily lives. 


6 Arabic as Language of Trade and Religion 


Outside the Arabic-speaking world, wherever Arabic-speaking traders or mis- 
sionaries ventured, Arabic was a superimposed language, albeit not a socially dom- 
inant one. It was widely used in Africa as the language of trade and frequently 
served as a second language for people in their contacts with Arabic-speaking 
traders, and even in their dealings with people speaking other languages. In West 
Africa, Arabic was an important lingua franca even before the spreading of 
Islam, as it was the language of official correspondence between the various West 
African states. Later on, it became the language of Islamic learning in centers 
such as Timbuktu and Djenné. Indigenous languages were heavily influenced, 
especially in their lexicon. This applies to other lingua francas, such as Hausa, 
Fulfulde, and Kanuri, but also to more local languages, such as Yoruba and Bambara. 

In East Africa, Swahili had already been established as a lingua franca when 
Arab traders arrived and it was never replaced by Arabic. Yet, even the name 
of the language (< Arabic sawahil ‘coasts’) betrays the influence of Arabic. During 
the Omani sultanate of Zanzibar, which served as one of the main markets for 
the slave trade, Arabic was the official language of the island. Many of the slaves 
probably communicated in a pidginized form of Arabic, which may have been the 
source for some of the Arabic loanwords in Swahili. When Swahili eventually 
became the official language on the East African coast, its standard form was 
written with Arabic script. From that time onward, borrowing from Arabic took 
place through written transmission by Islamic scholars. 

Outside Africa, trade relations followed the trade winds and brought Arab traders, 
in particular those from Hadramaut, as far east as the Indonesian Archipelago. 
In the Indian Ocean trade, they first contacted speakers of Dravidian languages, 
such as Telugu, Malayalam, and Tamil, some of whom converted to Islam and 
formed Muslim communities. All of these languages were affected by the contact 
with Arabic; in some languages, a special literature arose, written in Arabic script 
and full of Arabic loanwords, for instance the Arwi literature in Tamil (Tschacher 
2001) or the Mappilappattu poetry in Malayalam. 

Even when no direct contact with speakers of Arabic took place, the language 
still continued to exercise influence as the gateway to Islamic learning. Persian 
missionaries carried out the Islamic mission to South and Southeast Asia, and as 
a result, most people in Asia came to learn Islam through speakers of Persian, 
who used the local lingua franca, such as Malay or Chinese, for their teaching. 
In these regions, Arabic remained a written language, the language of the Islamic 
revelation. Some of the indigenous scholars went to study in the holy cities of 
Islam, and often stayed there for years, gaining enough proficiency in the written 
language to read and sometimes even write theological treatises in Arabic. When 
they started to use their own language for this purpose, these scholars introduced 
large numbers of Arabic loanwords into their own language. In this way, the 
languages of South and Southeast Asia, in particular Urdu/Hindustani and 
Malay, received their Arabic-derived lexicon. Even a language like Malay, far away 
from the Arab homeland, was written with Arabic script and became replete with 
Arabic technical terms. In Central Asia and Anatolia, too, Islam and the Arabic 
script were spread from Persia, and most of the Arabic-derived lexicon reached 
the Turkic languages through Persian. 

The diversity of the contacts is reflected in the layering of loanwords from Arabic. 
In African languages, for instance, where written transmission was preceded by 
personal contacts with Arab traders, there is an older layer of Arabic loanwords 
that were borrowed at an early stage. Very often, these are no longer recogniz- 
able as loanwords from Arabic, since they have become integrated in the lexicon 
completely, both phonologically and morphologically. In Fulfulde, for instance, 
Arabic nouns are integrated in the nominal class system and subjected to initial 
consonant alternation in the plural, just like Fulfulde nouns (see Theil 2007), e.g. 
keefee-ro, pl. heefeerbe ‘unbeliever’ (< Arabic kafir). At a later stage, when Islamic 
education reached these communities, and more people became acquainted with 
Arabic as a learned language, some of these original loanwords were re-Arabized, 
so that nowadays Fulfulde-speaking Islamic scholars tend to pronounce the 
Arabic phonemes /d/ and /z/, rather than using the Fulfulde approximation /j/ 
(e.g. dikru instead of jikru < Arabic dikr ‘mention [of God’s name]’). 

In principle, all Islamic languages exhibit a similar layering, even Persian and 
Turkish. When Persian had been reinstated as the language of the state, borrowing 
of loanwords took place predominantly through written transmission by scholars; 
this new layer of loanwords came on top of those that had been introduced after 
the initial conquest. According to Perry (1991), the layering of loanwords in Persian 
is reflected, for instance, in the two shapes of the feminine ending, -e and -at: older 
loanwords borrowed the Arabic feminine ending in the colloquial form -a > -e, 
whereas in later loanwords the written form -at became predominant. This is also 
demonstrated by the semantics of these words, those in -e usually being more 
concrete, whereas the -at ending is usually found in words that are more abstract 
in meaning (e.g. baladiye ‘town council’ as against baladiyat ‘expertise’, both < Arabic 
baladiyya). 

Even in Malaysia and Indonesia, there may have been an earlier period in which 
loanwords entered the language through a different route than that of the written 
transmission of later periods. In some cases, Arabic nouns were apparently 
borrowed again at a later stage. The modal expression perlu ‘ought to’ is an early 
loanword from Arabic fard ‘legal duty’; the Arabic word was borrowed again later 
as fardu ‘religious duty’. Not even Islamic scholars know that the former derives 
from Arabic, whereas the latter is commonly recognized as an Arabic word. 

Some languages have simultaneously been in contact with Arabic in two dif- 
ferent contexts. In the Western Sudan, Hausa participated in the same partial bilin- 
gualism as other West African languages. It borrowed from Arabic in the same 
way as other African languages (see Greenberg 1949), fully integrating words like 
litaafi ‘book’ (< Arabic al-kitab ‘the book’, with reanalysis of the definite article). 
Because of its function as a lingua franca in this area, it even became responsible 
for the spread of Arabic loanwords to other languages. But in the Eastern Sudan, 
a large Hausa-speaking group became fully bilingual in daily life with Arabic 
as the second language. Here, Hausa–Arabic bilingualism has led to extensive 
code-mixing (see Abu Manga 1999). 

A special case is the influence of Arabic in Madagascar. According to the local 
tradition, Arabic was brought here by immigrants from Mecca, at the time of the 
Prophet Muhammad. Be that as it may, Arabic script was used for a voluminous 
literature in a mixture of Arabic and the local Malayo-Polynesian language, 
Malagasy (sorabe). One clan on the island continued, even after the introduction 
of Christianity, to use a secret language that contained a large number of Arabic 
words and that may have been preserved until the present day (see Rajaonari- 
manana 1990; Versteegh 2001). 

The influence of Arabic as a superimposed language is not restricted to the 
Islamic world. In Europe, the cultural and scientific superiority of Arabic during 
the Middle Ages, especially in al-Andalus, led to the translation of numerous Arabic 
treatises into Latin, and the import of many scholarly terms in a number of 
sciences (see Cannon 1994), e.g. English algebra (< al-jabr ‘breaking [of equations 
with two unknowns]’), zenith (< simt), but also in other domains, e.g. arsenal 
(< dar as-sina’ ‘house of weapons’), admiral (< ‘amir al-bahr ‘commander of the sea’). 
The impact of Arabic mathematics is seen in words like algorithm (from al- 
Khwarizmi, a famous mathematician from the ninth century) and in the words 
cipher and zero, both from Arabic sifr ‘zero’, introduced when the Arabic num- 
bers were taken over in Europe. In the languages of the Iberian peninsula the 
cohabitation with Arabic during the centuries of Arabic occupation led to an enor- 
mous influx of Arabic loanwords in these languages in all domains of daily life, 
e.g. Spanish alcalde ‘mayor’ (< al-qadi ‘the judge’, with reanalysis of the definite 
article), fulano ‘so-and-so’ (< fulan), azúcar ‘sugar’ (< as-sukkar). Trade relations 
in the Mediterranean and contacts with Italian traders from Venice and Genoa, 
through which luxury goods were imported, led to a different route of import- 
ing loanwords as reflected by such loanwords as Italian melanzana ‘aubergine’ 
(< badinjan), zucchero ‘sugar’ (< sukkar), magazzino ‘store house’ (< maxzin), and dogana 
‘customs’ (< diwan). 


7 Categories of Borrowing 


In those contexts in which Arabic is the socially dominant language, borrowing 
takes place from vernacular Arabic; in all other cases, especially when borrow- 
ing takes place through written transmission, Standard Arabic is the language from 
which elements are borrowed. 


7.1 Phonological interference 


During the process of language shift in which the new Arabic vernaculars emerged, 
phonological restructuring was pervasive. In the Maghreb dialects, the substratal 
influence from Berber must have been very strong, leading to accent shift, reduc- 
tion of the short vowels, and elision of nonstressed vowels — features that are also 
found in Berber. 

In most cases, loanwords are adapted to the phonological system of the bor- 
rowing language; thus, Arabic pharyngeals and laryngeals often merge in many 
of the languages that borrow from Arabic. In the older loanwords in Hausa, for 
instance, /h/, /ḥ/, /ʾ/ and /ʿ/ have merged. Islamic scholars often attempt to 
approximate the Arabic pronunciation by introducing foreign phonemes. Thus, 
in Swahili dh, th occur exclusively in modern loanwords from Arabic, and are only 
distinguished from d, t by those who wish to show their Islamic learning. 

In the same way, new phonemes have been introduced in Arabic when this 
language was at the borrowing end, especially in the eastern Arabic dialects, for 
example /p/, /č/, from Turkish and Persian. More extreme cases are found in 
the Mauritanian Hassaniyya dialect, which has developed such phonemes as /nʸ/, 
/dʸ/, /tʸ/, borrowed from Zenaga Berber. 


7.2 Loan nouns 


It seems to be generally true of all contact processes that the overwhelming 
majority of loans are nominal in nature. Moravcsik (1975) even goes so far as to 
maintain that verbs always have to be nominalized before they can be used in 
the borrowing language. In the case of Arabic, its nonconcatenative morphology 
complicates borrowing even in the case of nouns, both from and into Arabic. 
Nonetheless, nominal loans in Arabic are sometimes integrated into the 
morphological structure, developing their own broken plurals. This applies both 
to older loanwords from Persian (e.g. ‘ibriq ‘water jug’, plural ‘abariq < Persian 
abrēz), and to modern loanwords from English (e.g. film, plural ‘aflam; bank, 
plural bunūk; duktūr, plural dakatira). Arabic dialects are much more adept at 
integrating foreign nouns, especially in the case of French nouns in the Maghreb 
dialects (see Heath 1989), e.g. Moroccan Arabic babur, plural bwabr ‘ferry’ (< French 
vapeur), dusi, plural dwasi ‘file’ (< French dossier). Very often, however, foreign nouns 
are pluralized by an affix, e.g. Egyptian Arabic sandwitš, plural sandwitšat ‘sand- 
wich’, makina, plural makinat ‘machine’. 

Most of the loans from Arabic in Islamic languages are religious in nature, 
but certainly not exclusively. There are many cultural loanwords, related to the 
culture of writing, such as the word for ‘pen’ (Arabic qalam), but also to items 
of trading, e.g. the words for ‘soap’ (Arabic ṣābūn), and for various fruits and 
vegetables, as well as luxury items. But the main part of the borrowed lexicon 
pertains to the domain of religion. Recitation of the Qur’an in the mosques and 
as part of the early training of children in reading and writing certainly contributes 
to the spread of Arabic loanwords. Again, Persian is a special case because some 
of the basic religious terms are not Arabic, but Persian: most Iranian mollahs use 
Xoda rather than Arabic Allah. 


7.3 Loan verbs 


In very early situations of linguistic contact, verbs are used freely, presumably 
because the speakers are exposed to very simple speech acts such as wishes and 
commands, for which only one verb form suffices, usually the imperative. This 
is indeed the form that seems to be at the basis of the verbal system in Arabic 
pidgins and creoles, such as Ki-Nubi or Juba Arabic (see Wellens 2005: 331ff.), 
e.g. Ki-Nubi askutu ‘to be silent’ (< Arabic sakata, imperative uskut), robutu ‘to bind’ 
(< Arabic rabata, imperative urbut), asharabu ‘to drink’ (< Arabic šariba, imperative 
išrab). When this primitive form of communication is expanded, more fine-grained 
distinctions become important, and aspectual distinctions are introduced, based 
on material from the target language, for instance Ki-Nubi aspectual markers like 
gi- for the continuous tense, which ultimately seems to derive from a form gaid 
(< Arabic qa‘id ‘sitting’). 

In a number of languages, the form of some of the verbal loans suggests that 
they go back to an older layer, possibly a pidginizing stage, before borrowing via 
written transmission commenced. In Fulfulde, the form of the loan verbs suggests 
that they derive from Arabic imperatives, although Theil (2007) believes that they 
go back to imperfects, e.g. tuuba ‘to repent’ (< Arabic taba, imperative tūb), faama 
‘to understand’ (< Arabic fahima, imperative ifham), dursa ‘to know by heart, to 
recite’ (< Arabic darasa, imperative udrus). In Swahili, too, some of the verbal forms 
may derive from an Arabic imperative, although the evidence is not conclusive 
(see Tucker 1946-7). The direct borrowing of verbs is clearly shown by such word 
pairs as salamu ‘greeting’ (< Arabic salam) and salimi ‘to greet’ (< Arabic sallama, 
imperative sallim). 

Outside the context of pidginization, three different options exist for languages 
to integrate verbal loans. In the first place, verbs can be integrated morphologic- 
ally. This is what happens with Arabic loan verbs in languages with a related 
nonconcatenative structure, such as Neo-Aramaic and Modern South Arabian 
(see above). Inversely, Arabic dialects freely integrate verbal forms from related 
languages, such as Ivrit (see Amara 2007), e.g. yitabbal ‘to take care’ (< Ivrit yitapel), 
yišaxbel ‘to duplicate’ (< Ivrit yišaxpel). This occurs most frequently in the dialects 
of the Maghreb, in which French and Spanish verbs are often integrated morpho- 
logically (see Heath 1989), e.g. diklara, imperfect ydiklari ‘to declare’ (< French déclarer), 
ramasa, imperfect yramasi ‘to collect’ (< French ramasser). In the Gulf dialects, too, 
recent English loan verbs have been attested (see Smeaton 1973). In Maltese, all 
Italian and English verbs are integrated morphologically. 

In the second place, verbs can be derived from nominal loans; some languages 
have their own morphological devices to derive verbs from borrowed nouns. In 
Swahili, for instance, the borrowed noun shughuli ‘business, occupation’ (< Arabic 
šuġl) is the source for the verbs shughulika ‘to be busy’, shughulisha ‘to occupy, keep 
busy’. In Malay/Indonesian, all borrowings from Arabic are nominal in nature, 
but, just like Malay nouns, these borrowed nouns serve as the basis for verbal 
derivation, e.g. from akhir ‘last; end, finish’ (< Arabic ‘axir) berakhir ‘to end in, to lead 
to’, mengakhiri ‘to put an end to, to finish’, mengakhirkan ‘to postpone’. Denominal 
verbs from foreign nouns are common in standard Arabic, too, e.g. ‘aksada ‘to oxy- 
dize’, nakkala ‘to nickel’, talfana ‘to make a telephone call’. 

The third strategy for borrowing verbal concepts from foreign material consists 
in the use of “light verbs” to accommodate foreign verbs, either by connecting 
the foreign verb with an auxiliary ‘to do’, or, more commonly, by combining the 
verb ‘to do’ with a (verbal) noun. Sometimes, there are two light verbs, one of them 
meaning ‘to do’, used with agentive verbs, and one of them meaning ‘to be’, used 
with stative verbs. In Muysken’s (2000) typology of code-mixing, the two strat- 
egies represent quite different forms of borrowing: integrating foreign verbs is a 
form of insertion, whereas the use of a light verb with a foreign verb constitutes 
alternation. Insertion is used by speakers who have insufficient knowledge of the 
foreign language, for instance in a colonial context. In situations where speakers 
freely alternate between two languages, for instance in a migration context, light 
verbs are the preferred device to accommodate verbs from the socially dominant 
language (Gardner-Chloros & Edwards 2004; Boumans 2007). Examples of this 
use of light verbs are found in Moroccan Arabic–Dutch code-mixing with the verb 
dar ‘to do’ (Boumans 1998), and in Arabic–English code-mixing with the verb ‘amal, 
which has the same meaning (Othman 2006). In all of these cases the light verb 
is constructed with a foreign verbal form, e.g., dert-hum ontmoeten ‘I met them’ 
[lit. ‘I did them meet’] (with a Dutch infinitive). 

This contrasts with the use of light verbs in Arabic loanwords in Persian, Turkish, 
and Urdu, where the light verb is constructed with either a verbal noun or a noun 
with a highly verbal content, e.g. Persian ta’ajjub kardan ‘to be amazed’ (< Arabic 
ta‘ajjub ‘amazement’); Turkish tesir etmek ‘to influence’ (< Arabic ta’tir ‘influence’); 
Urdu inkar karna ‘to deny’ (< Arabic ‘inkar ‘denial’). Here, no stable bilingualism 
exists and borrowing takes place almost exclusively through written transmission 
by Islamic scholars, who usually know only standard Arabic. Ordinary speakers 
are not exposed to Arabic in their daily life, and they borrow these light verb 
constructions from the texts. In the case of the speakers of Mardin Arabic in Anatolia, 
who use the light verb sawa ‘to do’ with Turkish and Kurdish nouns (see Grigore 
2007: 157-9), and who are completely bilingual, written transmission is of course 
out of the question. Presumably, they have borrowed the entire construction from 
Turkish. 

In some code-mixing situations, inflected verbs are embedded in the community 
language, as happens in Hausa–Arabic code-switching (Abu Manga 1999) and in 
Cypriot Arabic-Greek code-switching (Borg 1985). This is probably the final stage 
in the borrowing process, before a complete shift to the other language. 


7.4 Borrowing of functional elements 


Functional elements are commonly held to be the most difficult elements to 
borrow, although they are actually borrowed quite frequently. In written trans- 
mission (but not exclusively), Arabic words are used to form new functional 
elements, especially compound prepositions and conjunctions, e.g. Persian va ‘and’ 
(< Arabic wa-, also in Turkish ve), amma ‘but’ (< Arabic ‘amma), vaqtike (< Arabic 
waqt ‘time’ + Persian ke). In Malay, conjunctions derived from Arabic are common, 
e.g. lau ‘if’ (< Arabic law), waktu ‘when’ (< Arabic waqt ‘time’), oleh sebab ‘because’ 
(< Malay oleh + Arabic sabab ‘reason’). In many African languages, Arabic nouns 
and conjunctions have been grammaticalized as prepositions or conjunctions, e.g. 
Fulfulde sebi, saabi, sabab ‘because’ (< Arabic sabab ‘reason’); Swahili ao ‘or’ (< Arabic 
‘aw), sababu ‘because’; Hausa lakin ‘but’ (< Arabic lakin). 

Numerals are a special category of functional words; Arabic numerals have been 
borrowed in a number of African languages, presumably via trade relations. In 
Swahili, for instance, the numerals 6, 7, and 9, as well as those from 20 to 90, and 
the words for 100 and 1,000 derive from Arabic. 

An exceptional case is the borrowing of the pronouns for the first and second 
person (ane and ante) from Arabic in the Indonesian spoken in Jakarta. The motiv- 
ation here is obvious: using Arabic pronouns enables the speakers to commu- 
nicate without having to take into account the complicated rules for addressing 
people in Malay. 

It is difficult to classify the ideophones that have been borrowed in some of the 
West-African varieties of Arabic, as in Nigerian Arabic (Bornu). These ideophones 
have the same grammaticalized function as in the neighboring languages Kanuri, 
Fulfulde, and Hausa (see Owens 2004), e.g. co, which always occurs with the adjec- 
tive ‘hot’, or cil, which always accompanies words meaning ‘black’. No other Arabic 
dialects have developed such ideophones. 


7.5 Syntactic interference 


Syntactic interference is particularly strong in the case of substratal influence. 
In those dialects that are spoken as a minority language in a linguistic enclave, 
syntactic interference is very intense. In Uzbekistan, practically all speakers of 
Uzbekistan Arabic (a few thousand at most) are bilingual in Uzbek and Uzbekistan 
Arabic. The influence of Uzbek has led to a change in the canonical word order, 
which has become SOV. In line with this, other constituents also have received 
a new order, and the language has developed postpositions (see Versteegh 
1984-6). In the Anatolian Arabic dialects, influence from Turkish has not gone 
so far, but still the Arabic dialects spoken there have undergone profound struc- 
tural changes. In Mardin Arabic, for instance, one notes the frequent use of an 
adhortative particle da with the imperative, the productive use of nominal 
suffixes such as -siz, e.g. ‘aqal-siz ‘stupid’ (Arabic ‘aql ‘intelligence’ + Turkish -siz), 
which has become the general Turkish word akilsiz ‘unreasonable, foolish’, and 
the use of Turkish conjunctions such as the Turkish complementizer ki (see 
Grigore 2007). 


8 Conclusion 


As the native language of more than 200 million speakers, and also the religious 
language of more than 800 million Muslims, Arabic clearly belongs to the group 
of world languages. At the same time, the electronic age has not left it unaffected. 
Like all other languages, Arabic cannot escape the global influence of English. 
Even the barrier of the Arabic script does not seem to be as impregnable as before, 
because in texting and chatting young people everywhere abandon it for a crude 
transcription in Latin script, mixing their messages with cool expressions taken 
from English. At the same time, the influence of Arabic is widespread, and words 
like ayatollah, sharia, jihad, which used to be the domain of scholars in Islamic 
studies, have now become household words all over the world through the 
media coverage of current events. More importantly, even in the diaspora, where 
language shift and language attrition mark the language proficiency of younger 
generations, many young people are making an effort to maintain or reclaim their 
linguistic roots, or simply to devote themselves to the study of Arabic as the 
language of their religion. 


32 Turkic Language Contacts 


Turkic has a vast area of distribution. It extends from the southwest with Turkey 
and her neighbors, to the southeast, to eastern Turkistan and farther into China. 
From here it stretches to the northeast, via South and North Siberia up to the Arctic 
Ocean, and finally back to the northwest, across West Siberia and East Europe. 
The area comprises a great number of languages. Regions in which Turkic is 
spoken include Anatolia, Azerbaijan, the Caucasus region, Iran, Iraq, Afghanistan, 
West and East Turkistan, South, North and West Siberia, and the Volga region. 
In the past, the Turkic-speaking world also included enclaves in the Ponto-Caspian 
steppes, the Crimea, the Balkans, etc. 

Turkic offers particularly rich sources of data for the study of language contact. 
The continuous and massive displacements of Turkic-speaking groups through- 
out their history have led to numerous new configurations of various kinds. 


1 Intrafamily Contacts 


In one kind of areal contact situation, varieties of Turkic have encountered and 
influenced each other. This was the normal situation in the old tribal confederations 
with their mobile heterogeneous groups. The encounters led to the emergence of 
modified varieties. The population movements caused separation of linguistically 
close groups with the effect that related languages did not occur in clear geographic 
clusters. The interaction in a number of contact areas has led to new constellations 
involving convergence, innovation, mixture, leveling. Varieties with different 
backgrounds have developed common features. Several Turkic varieties have been 
used as koines, transregional codes for trade and intergroup communication, e.g. 
Azeri in Iran and the Caucasus region. Languages of the central part of the Turkic 
world have undergone a good deal of leveling, whereas those spoken in the 
periphery, e.g. Turkish, have preserved many older features. Languages such as 
Yakut, Salar, Yellow Uyghur, Khalaj, and Karaim have developed for centuries 
in relative isolation from their original close relatives, preserving old features and 
acquiring new ones in their respective environments. 


2 Interfamily Contacts 


Turkic languages have for centuries been spoken in highly dynamic contact settings, 
entertaining numerous interfamily contacts with genetically and typologically dif- 
ferent varieties. Areas of intense contact include Central Asia, Siberia, Transcaucasia, 
Anatolia, the Balkans, the Volga-Ural region, and northwest Europe. Complex 
situations of copying and language shift have led to intricate contact-induced 
changes in linguistic subsystems. The contact languages include Indo-European, 
especially Iranian, Slavic, recently West European languages, and non-Indo- 
European languages such as Mongolic, Uralic, Tungusic, Chinese and Arabic. 

Because of the unique mobility of many Turkic-speaking groups, contact-driven 
developments have been especially important. The encounters have led to various 
contact-induced processes of borrowing, or, to use a more adequate term: copying. 
The languages involved have undergone processes of change and shift in different 
dominance relations determined by various sociocultural conditions. There have 
been two types of code interaction: “take-over” and “carry-over” copying (see 
Johanson 2008). Speakers have taken over copies from a foreign code into their 
native code, or they have carried over copies from their native codes into their 
own variety of a foreign code. Speakers of Turkic have thus taken over foreign 
lexical, morphological, phonological, and syntactic elements into their own vari- 
eties. Speakers of non-Turkic languages, e.g. Iranian, Finno-Ugric, Greek, Mongolic, 
Tungusic, Samoyedic, and Yeniseic, have shifted to Turkic and carried over native 
elements to their own varieties of Turkic, which has led to substrate effects of 
various kinds. 

Lexical elements and free function markers have been copied globally, as a whole, 
including their material shape (substance) and functions, i.e. properties of mean- 
ing, combinability, and frequency. They have also been copied selectively, as “loan 
translations” or “calques,” with respect to one or more semantic, combinational, 
and frequential properties, the material shapes being provided by indigenous 
morphological material. The same is true of affixes, i.e. bound derivational and 
inflectional elements. Structural patterns have been copied selectively as semantic- 
combinational calques using indigenous morphemes. Phonological elements 
have been copied as elements occurring in loanwords, global lexical copies, or 
selectively, as elements copied onto indigenous units. 


3 Structural and Social Factors 


The likelihood of a particular element being copied in Turkic language contacts 
has been determined in part by social factors, such as the prestige of the model 
language, and in part by structural factors of “attractiveness” (Johanson 2002: 2-3, 
43-54). Attractive properties have been copied even in the absence of over- 
whelming social pressure. The presence of strong pressure has, however, ultimately 
led to the copying even of unattractive structures. If, in Turkic language contacts, 
a language has copied some element from another one, 

1 the element might have been attractive, or 

2 the social influence of the model language on the copying language might have 
been sufficient to overcome the unattractiveness of the element, or 

3 both attractiveness and social influence might have been at work (see Comrie 
2002: viii-ix). 


The contacts between Turkic and non-Turkic languages show that, under appro- 
priate social circumstances, in particular contact that is sufficiently intense and 
sufficiently long-lasting, almost any feature from one language can ultimately be 
copied into another. Massive contact influence has sometimes caused consider- 
able deviations from the original typological profile of the languages involved. It 
has been possible for languages to copy structures that appear to be typologically 
inconsistent with the rest of their structure. Turkic languages spoken under 
strong foreign impact and in relative isolation from the bulk of their relatives have 
abandoned old features and developed new ones. Languages such as Karaim and 
Gagauz, both spoken in East Europe, have been strongly influenced by Slavic, 
displaying excessive copying of phonology, syntax, and lexicon (see e.g. Menz 
1999; Csató 2000). 

The influence of Turkic on its contact languages has been great. Many aspects 
of Turkic structure have turned out to be attractive, but the sociolinguistic ques- 
tions concerning the nature of the migrations and political expansions have proven 
equally important. How did the speakers of the Turkic varieties enter the areas 
in question? Political expansion does not necessarily lead to linguistic expansion. 
The size of the relevant politically expanding incoming groups may have varied 
considerably. There were major, massive migratory movements and minor move- 
ments of a thin ruling layer. In both cases, code shifts took place. The question is: 
Did local speakers shift to the incoming code, or did incoming groups shift to the 
local code? There is no need to postulate massive immigration as a precondition 
for code shift. Even small incoming élites have imposed their codes on compar- 
atively large existing populations. In these cases we find Turkicized peoples of 
largely local origin without major demographic changes. A case in point seems to 
be the introduction of Azeri in the Transcaucasian area and Iran. A relatively small 
number of Turkic-speakers seems to have moved in, displacing the existing élites 
and causing the replacement of existing codes. This kind of introduction of Turkic 
may also have taken place in other areas. The incoming Turkic-speaking groups 
mostly had an advanced political organization which contributed significantly to 
their dominance. 


4 Examples of Contact Areas 


Some examples of major contact areas will be given below. It should be remem- 
bered that all languages belonging to the Russian sphere of influence show 
strong effects from Russian, since the second half of the nineteenth century at the 
latest. The influence is stronger in languages that were in contact with Russian 
relatively early, e.g. Tatar, Kazakh, and Yakut. 


4.1 Central Asia 


Turkic and Iranian, genealogically unrelated and typologically different, have inter- 
acted for many centuries in Central Asia. The development dates back to early 
contacts of nomad groups in the Eurasian steppes. Longstanding intense contacts 
between southeastern Turkic and eastern Persian, the forerunners of Uzbek and 
Tajik, have led to considerable influence in both directions and to close symbiotic 
bonds; see Csató et al. (2004), Johanson and Bulut (2006). Convergence processes 
resulted in numerous shared features in phonology, morphology, vocabulary, 
and syntax. Since the developments are highly complex, it is sometimes difficult 
to determine the direction of influence. It is often also difficult to pinpoint the 
developmental stages of the languages involved at the time of copying, i.e. to 
distinguish older and more recent changes. 

Large areas in Central Asia underwent increasing Turkicization. Turkic dialects 
were more or less Iranicized. They were first influenced by Soghdian and, after 
the Muslim conquest, New Persian, which took over the role of an interethnic lingua 
franca and was the medium through which Central Asian Turks became familiar 
with Islam and urban culture. Elements of Persian and Arabic origin were spread 
by merchants and religious teachers along the Silk Road. 

The Turkic varieties of eastern Turkistan, today’s Xinjiang, have been in contact 
with numerous languages. The oldest Turkic population of the area had close 
relations to speakers of Iranian. Strong substrate influences were exerted by 
speakers of Indo-European shifting to Turkic. Speakers of Old Uyghur moving 
into the northeastern part of the Tarim basin also came into contact with 
Tokharian, a non-Iranian Indo-European language. 

Uzbek has been significantly influenced by Persian. Its dialects exhibit various 
degrees of Iranicization. Uzbek and Tajik display many striking structural simi- 
larities. Certain shared features are due to Turkic influence on pre-Tajik eastern 
Persian varieties. An increasing Uzbek influence on Tajik may be observed. 
Northern Tajik has even been described, albeit inadequately, as a Turkic language 
“in statu nascendi” (Doerfer 1967: 57). 

The development of New Persian in the direction of the Turkic type is obvi- 
ous. Since the Middle Persian period, Persian had shared substantial typological 
characteristics with Turkic, developing into “the most atypical Iranian language” 
(Windfuhr 1990: 530). 

The Central Asian contact area also includes Mongolic influence on Kazakh, 
Kirghiz, etc., as well as Chinese and Tibetan influence on the Turkic languages 
of western China. 


4.2 Siberia 


South Siberia is a melting-pot of Turkic varieties characterized by contacts with 
Samoyedic, Yeniseic, Mongolic, and Russian. The historical and anthropological 
origins of the speaker groups are rather different. 

The varieties show various degrees of substrate and adstrate influence from 
Samoyedic and Yeniseic. Some groups speaking these languages have shifted to 
Turkic quite recently. Substrate effects of South Samoyedic, which belongs to the 
Uralic family, have played a major role. The Turkicization of the speakers of Yeniseic 
varieties, e.g. Ket and Kot, was still in progress in the nineteenth century. Ket 
is the last survivor of Yeniseic. South Siberian Turkic shows clear Mongolic 
influences, in particular strong Oirat impact from the fifteenth century on. Tuvan 
has been influenced by Middle Mongol, Oirat, and Khalkha. Western Buryat was 
once spoken in the area, but has now vanished. Altay Tuvan as spoken in China 
displays phenomena induced by contact with Chinese. 

Russian has exerted strong lexical and syntactic impact on all South Siberian 
varieties and displaced some of them. 

Yakut, or Sakha, spoken in North Siberia, deviates considerably from other Turkic 
languages, from which it has been isolated for many centuries. It displays some 
unique innovations partly due to Mongolic and Tungusic influence. There is 
an old Buryat Mongolic layer from the period when ancestors of Yakut speakers 
settled on the shore of Lake Baikal. 

An early impact may have been exerted by Yeniseic, a formerly widespread 
Paleoasiatic language. After the emigration of the Yakut to North Siberia, their Turkic 
language underwent strong substrate influence from Tungusic dialects. The 
closest neighbors are still the North Tungusic languages Evenki, in the northern 
and northwestern parts of Yakutia, and Even, previously called Lamut, in the 
northeastern parts, both probably with Paleoasiatic substrates. Contacts with 
the isolated language Yukagir have also been important. Dolgan, a Yakut dialect 
spoken on the Taimyr and considered a language in its own right, has both an 
Evenki and a Samoyedic (Nganasan) substrate. The complex problems of language 
contact and language shift in the area are, however, still partly unsolved. 


4.3 Volga-Kama 


The Volga-Kama region has been a vital contact area for many centuries. The Turkic 
actors involved are Chuvash, Tatar, Bashkir, and their predecessors. The non-Turkic 
actors are the Finno-Ugric languages Mari (formerly called Cheremis), Mordvin, 
and Udmurt (formerly called Votyak), and their precursors, as well as Russian. 
The varieties show effects of long-term areal contact processes due to complex 
combinations of “take-over” and “carry-over” processes. Though the relations of 
social dominance have varied through the centuries, the processes have led to the 
introduction of new linguistic patterns, typical Sprachbund phenomena. 

Probably as early as the fifth century, groups speaking Kipchak Turkic were 
present on the middle Volga, absorbing local Finno-Ugric groups. Oghur Turkic 
influence came with the Volga Bulghars, who assimilated native groups of the 
region. Komi-Zyrian features indicate that intensive contacts took place between 
Volga Bulghar and Permic in the tenth century. Oghur Turkic is commonly thought 
to have influenced the Finno-Ugric and Russian varieties of the region, mainly in 
phonology. Oghur tribes came to dominate the Finnic groups on the left bank of 
the Volga, assimilating speakers of the predecessors of Meadow Mari and Udmurt. 
Chuvash, the only survivor of the Oghur Turkic type, displays substrate phenomena 
due to close contacts with Volga Finnic. The influence is strongest in Upper Chuvash, 
especially in the immediate Mari neighborhood. 

With the Mongol invasion in the thirteenth century, Kipchak-speaking newcomers 
came to play a major role in the area. Tatar strongly influenced Mari, Udmurt, 
and Mordvin. No other Finno-Ugric languages have been so strongly influenced 
by Turkic as Mari and Udmurt. On Mari influence upon Chuvash, see Agyagási 
(1998). 

The Turkic varieties of the area exhibit, for example, many loans of Middle Mongol 
origin, partly borrowed via Tatar. Certain Chuvash-Mongolic correspondences 
go back to early contacts of Oghur-speaking and Mongolic-speaking groups in 
South Siberia. 

The Russian impact increased rapidly from the middle of the sixteenth century 
on, i.e. after the fall of the Khanate of Kazan. 


4.4 Transcaucasia and Iran 


Turkic and Iranian have interacted for many centuries in Transcaucasia and Iran. 
The groups that moved southwestwards from Central Asia to establish the future 
Oghuz branch of Turkic interacted closely with Persian-speaking groups. The Seljuk 
groups who settled in Transcaucasia encountered speakers of other Iranian vari- 
eties, e.g. the Northwestern Iranian language Tati, closely related to Talysh, and 
Kurdish dialects. The Seljuk conquest of the eleventh century led to a massive 
Turkicization of the area, with many speakers of Iranian varieties shifting to Turkic. 
Many of the idiosyncratic features of the Turkic of this area may thus be due 
to Iranian and other local substrates. 

Iranicization is the most conspicuous feature of the Turkic varieties spoken in 
Iran; see, e.g., Kıral (2001), Bulut (2006). Direct contacts with spoken Persian, 
developed over many centuries in asymmetric settings, have left profound unidirectional 
influences at all linguistic levels in Azeri dialects, in South Oghuz (i.e. Kashkay 
and related varieties, and the transitional varieties between them), in Khorasan 
Turkic, and in the non-Oghuz language Khalaj. Varieties of the adjacent border 
regions of Iraq and southeastern Anatolia share many of their features. 

The non-Oghuz Turkic language Khalaj, spoken in central Iran, has been heavily 
influenced by Persian, Luri, etc., without losing its specific Turkic characteristics. 
The Iranicization of Kashkay, spoken in southern Iran, has become more pronounced 
in recent decades. 


4.5 The Caucasus 


The Caucasus offers rich materials for studying genetically diverse languages that 
have been in contact for millennia. Turkic languages are young languages in 
the Caucasus region. The ancestors of Kumyk and Karachay-Balkar may have 
entered the area in the early Middle Ages. Karachay-Balkar was an entrant from 
the steppes that was pushed into the mountainous regions in the thirteenth 
century and later on driven to still poorer locations in the highlands. Noghay 
arrived relatively late, after the end of the fourteenth century. Nationalist 
Turcologists, however, claim that Turkic has existed in the Caucasus for at least 
5,000 years. 

Karachay-Balkar, Kumyk, and Noghay have been influenced by their Caucasian 
neighbors — Kumyk by the Northeast Caucasian languages of Daghestan, the south- 
ern dialects of Kumyk especially by Dargwa of the Nakho-Daghestanian group, 
and the White Noghay dialect by Cherkes and Abazian of the Northwest Caucasian 
family, etc. 

In the 1930s, the Marr school of linguistics, founded by Nikolaj J. Marr 
(1864-1934), focused on Karachay-Balkar, which was considered a cross-breed 
of Turkic and Caucasian elements. This alleged status of Karachay-Balkar is not 
supported by linguistic data. There is a certain degree of Caucasian influence, includ- 
ing substrate influence following code shift. But Caucasian languages have left 
relatively little imprint even upon this language. 

All languages indigenous to the region have been in intense contact with 
Russian. 


4.6 Anatolia 


Anatolian Turkish has interacted with many languages: Indo-European languages such 
as Greek, Kurmanji, Zaza, Armenian, Judeo-Spanish; the Semitic languages 
Arabic and Syriac; and the Caucasian languages Cherkes, Georgian, and Laz. See 
Andrews (1989) for details on the current language situation. 

The history of settlement and assimilation of Turkic-speaking groups in Anatolia 
is long and complex. Oghuz-speaking groups settled in the Byzantine territory 
before the Seljuk immigration. The large Seljuk immigration began in the eleventh 
century, and the Turkicization of the indigenous populations probably started 
toward the end of the Seljuk rule. The political power of the Ottoman state had 
a strong impact on the processes of Turkicization. Its history is full of minor or 
major population moves, e.g. immigration of new Turkic-speakers, mostly from 
the east. The immigration increased with the extension of Russian rule in the neigh- 
borhood. Large masses of immigrants arrived after the annexation of the Crimea 
in 1783 and the definitive subjugation of the Caucasian area in 1864. 

Due to political developments, the presence of non-Turkic languages is now 
rather reduced. The largest ones are the Iranian languages Kurmanji and Zazaki. 
Greek and Armenian are present almost only in Istanbul. On Greek as formerly 
spoken in Cappadocia, see Dawkins (1916). On traces of Greek in Trabzon, see 
Brendemoen (2002). Judeo-Spanish, a Hispanic variety spoken by Jews of Spanish 
origin, is now vanishing. Arabic is spoken by Muslims on the borders with Syria 
and Iraq and by Christians in and around Mersin. Neo-Aramaic is still spoken by 
small groups in the eastern provinces, especially in Hakkari. Caucasian varieties 
such as Laz, Georgian, Abkhaz, Adyghe, and Cherkes are found in the northeast. 
The last speaker of Ubykh, of the Northwest Caucasian group, died in 1992. 

Since direct contacts between the spoken languages are lacking, Persian impact 
is much weaker in Anatolia than in Iran. Ottoman Turkish was a typical member 
of the languages of the Islamic cultural sphere, but as a result of language reform, 
the share of Arabic-Persian loans in modern Turkish has been drastically 
reduced. Anatolian Turkish as spoken in Cyprus has been in direct contact with 
Greek since the sixteenth century and with English since the end of the nineteenth 
century. 


4.7 The Balkans 


The Balkans have for many centuries been a region of intense multilingualism, 
where Oghuz and Kipchak Turkic varieties established contact with South Slavic 
and Albanian. In the Ottoman period, vast areas of the Balkans were colonized 
from Anatolia, which led to the creation of West and East Rumelian Turkish dialects. 
Large non-Turkic groups accepted Islam and were Turkicized. There is uncertainty 
about the numbers of Turkic-speaking colonizers and the groups of autochthonous 
converts whose local languages served as substrates. Some scholars tend to classify 
the Turkish dialects as creoles, i.e. nativized pidgins. Turkish has exerted extensive 
influence on Romani dialects. Some Roma groups have their own varieties of Balkan 
Turkish. 

The dominance of Turkish in the Ottoman period gave the Balkan languages 
many common features. The Turkish impact was first dealt with by Franz von 
Miklosich in 1884. Spoken Turkish has been important for the formation of the 
so-called Balkan Sprachbund of Slavic, Greek, Romance, and Albanian varieties 
sharing certain areal features. For an overview, see Friedman (2003). 

The autonomy of the Balkan peoples was followed by an abrupt decrease in 
Turkish influence. Turkish-speaking masses left for Turkey, especially after World 
War I. Through the population exchange with Greece, about half a million Turks 
from Greece emigrated to Anatolia. Turkish has thus changed from a dominat- 
ing language to a dominated language in the Balkans, but the linguistic contacts 
continue. West Rumelian Turkish is still spoken in Macedonia and Kosovo, East 
Rumelian Turkish mainly in Bulgaria. 

One language, Gagauz, has been subject to particularly strong Slavic impact, 
leading to drastic typological changes. A similar case, outside the Balkans, is Karaim, 
whose precursor was transplanted from the Crimea to Ukraine and Lithuania about 
600 years ago. 


4.8 Northwestern Europe 


During the last half century, considerable Turkish diaspora groups have emerged 
in northwestern Europe through immigration from Turkey and also from former 
Yugoslavia. Linguistically, they live in unbalanced, asymmetrical contact situations, 
their primary language being dominated by the languages of the host societies, 
i.e. German, Dutch, Danish, Norwegian, Swedish, etc., which are used for group- 
external communication. It is still uncertain whether the Turkish varieties 
spoken by persons raised in Europe will ultimately diverge from Turkish as 
spoken in Turkey. 


5 Written Turkic Languages 


A short overview of the development of written Turkic languages is given in Menges 
(1968: 165-73). The oldest documents of written Turkic, the Orkhon and Yenisey 
inscriptions, do not exhibit any loans except some Chinese and Iranian titles. The 
Old Uyghur literature of eastern Turkistan consists of Buddhist, Manichean, and 
Nestorian texts, almost exclusively translations. It displays loans from several non- 
Turkic languages — Sanskrit, Tokharian, Soghdian, etc. — mostly pertaining to the 
religious, philosophical, political, and legal domains. It also documents interest- 
ing Turkic calques of Buddhist terminology. 

This literary culture came to an end with the Islamic conquest. In the Islamic 
successor languages of Old Uyghur, non-Arabic terms in the areas of spiritual, 
social, and political life were not tolerated. Foreign words and numerous Turkic 
words were replaced by Arabic-Persian ones. The Turkic literary languages that 
emerged in the late Middle Ages were subject to strong Persian influence from 
the very beginning. Chaghatay and Ottoman were thoroughly influenced by a 
prestigious Persian-Arabic vocabulary that gave them a remarkable richness of 
expressive resources. Most Chaghatay authors were probably bilingual in Turkic 
and Persian. From the fifteenth century on, stylistically refined registers, overloaded 
with Persian-Arabic elements, emerged in Ottoman élite literature. The abundance 
of Arabic-Persian loans in Ottoman led to strong puristic efforts in the twentieth 
century. In spite of all differences, it is a fact that most written Turkic languages 
have been influenced by Persian and Arabic. A much more recent process is the 
strong Russian influence that the Turkic literary languages of the Russian political 
sphere have been subject to. 


5.1 Copied lexicon 


Turkic languages have been exposed to a good deal of foreign lexical influence. 
Already Old Uyghur displays lexical copies from Chinese, Soghdian, and 
Tokharian. The vocabulary of modern languages, particularly Modern Uyghur, 
still mirrors these and other manifold contacts (Yakup 2005). 


5.2 Arabic-Persian loans 


Most Turkic languages possess words of Arabic and Persian origin that have partly 
superseded the native Turkic vocabulary. Persian lexical influence, which also 
includes Arabic words introduced via Persian, has been very strong in all Turkic 
languages of the Islamic world, covering various domains of Islamic culture 
and representing both abstract concepts and concrete concepts pertaining to Oriental 
urban life. The loans represent all fields of traditional Islamic society. Many words are 
inherited from the Chaghatay literary tradition. Karakhanid was the first Turkic 
written language to contain Arabic-Persian loans. Numerous loans are found in 
modern Azeri, South Oghuz, Khorasan Oghuz, Turkmen, Uzbek, Uyghur, Tatar, 
Bashkir, Kazakh, Noghay, Karakalpak, Kirghiz, etc. 

Examples of Persian loans: 

Azeri kor ‘blind’, räng ‘color’, pul ‘money’ 
Uzbek gošt ‘meat’, låy ‘mud’, parda ‘curtain’, nan ‘bread’ 
Turkmen gül ‘flower’, ša:t ‘glad’ 
Tatar zur ‘big’, bakča ‘garden’, atna ‘week’, šähär ‘city’, čaršaw ‘curtain’ 
Chuvash čaršav ‘curtain’ 
Kazakh apta ‘week’, künä ‘crime’, nan ‘bread’, bazar ‘market’ 
Karakalpak diydar ‘face’ 
Kirghiz künö: ‘sin’, ša:r ‘city’, bakča ‘garden’ 


Arabic loans copied from Persian: 

Azeri kef ‘pleasure’, dua ‘prayer’, gädär ‘amount’ 
Uzbek mümkin ‘possible’, nihåyat ‘at last’, kuwwat ‘strength’ 
Tatar mäktäp ‘school’, taraf ‘side’, vakit ‘time’ 
Chuvash văxăt ‘time’ 
Kazakh akil ‘intellect’, ɣilim ‘science’, maɣina ‘meaning’, wakit ‘time’ 
Turkmen inθa:n ‘human being’, xat ‘letter’ 


Ottoman Turkish displayed an overwhelming number of Arabic-Persian loans, 
which ousted a considerable part of the native vocabulary. Even at the end of the 
Ottoman era, terms designating phenomena of the modern world, e.g. political 
and scientific terms, were mostly coined by means of Arabic devices. The modern 
Turkish lexicon still possesses a significant Arabic-Persian component, though puristic 
language reform has weakened its dominance. Numerous old words have been 
abandoned and replaced by so-called Öztürkçe (‘Pure Turkish’) neologisms. 

Azeri has preserved numerous words of Arabic-Persian origin. The varieties of 
Iran also possess numerous copies from spoken Persian. There has not been any 
radical language reform comparable to the Turkish one. We thus find Azeri words 
such as lüɣät ‘dictionary’ versus Turkish sözlük, müällim ‘teacher’ versus Turkish 
öğretmen, pul ‘money’ versus Turkish para. Many of the Turkish neologisms 
are not intelligible to native speakers of Azeri. 

South Oghuz and Khorasan Oghuz exhibit a strong Persian influence in their 
lexis. Turkmen and Uzbek copied words from Persian and inherited words from 
the old literary language Chaghatay. Longstanding intensive Uzbek contacts with 
Iranian have resulted in numerous loanwords. Even female gender is expressed 
in some borrowed nouns, e.g. Uzbek šåirä ‘poetess’ vs. šåir ‘poet’. 

Modern Uyghur exhibits numerous Arabic-Persian loans, introduced via urban 
varieties of Uzbek and through the Islamic literature, i.e. the common Central Asian 
heritage transmitted by the written language Chaghatay. Though the strong 
influence has now decreased, about one fifth of the vocabulary is still of Arabic- 
Persian origin. 

The Arabic-Persian loans found in Kazakh have entered via Tatar and Chaghatay. 
Karakalpak possesses additional loans introduced via Uzbek and Turkmen. 
Noghay has also preserved many Arabic-Persian words of the cultural vocabulary 
typical of the pre-Soviet era. 

Even in Kirghiz, Arabic-Persian lexical elements form a considerable part of 
the lexicon. They have been copied from Chaghatay and Uzbek, particularly into 
the southern dialects. The northern dialects, on which the standard language is 
based, are less influenced by the Islamic vocabulary. 

Middle Kipchak displays a great number of Persian words and Arabic words 
copied from Persian. Modern Kipchak Turkic languages, e.g. Tatar and Bashkir, 
have numerous loans of Arabic-Persian origin. Most of the loans in Chuvash have 
entered via Tatar, though certain words were borrowed as early as the Volga Bulghar 
period (Scherner 1977). 

As a rule, Turkic languages outside the Islamic cultural sphere do not exhibit 
Arabic-Persian loans. However, some South Siberian languages offer a few 
exceptions, e.g. Khakas nan ‘bread’ or Altay Turkic urmat ‘reputation’, ultimately 
going back to Persian na:n and Arabic hurmat, respectively. 


5.3 Slavic loans 


All languages in the Russian influence zone exhibit Russian loans. They mostly 
represent phenomena of modern life, aspects of Russian and European civiliza- 
tion, technical, scientific and administrative matters, and political and social 
concepts of the Soviet era. Many recent internationalisms have been borrowed 
via Russian. Russian loans mostly constitute a recent layer, introduced at the end 
of the nineteenth century at the earliest and during the Soviet era at the latest. 
In languages with long-standing contacts with Russian, words that have become 
archaisms in Russian may still be part of the active lexicon. 

Numerous loans are found in Yakut, Chuvash, Tatar, Bashkir, South Siberian 
Turkic, Kazakh, Kirghiz, Uzbek, Noghay, Uyghur (particularly outside Xinjiang), 
etc., e.g. Uzbek plan ‘plan’, stul ‘chair’, stakan ‘tumbler, glass’, studentka ‘female 
student’; Turkmen poθyolok ‘settlement’, gaδyet ‘newspaper’, fe:rma ‘farm’; Tatar 
par ‘steam’, kuxnya ‘kitchen’, vrač ‘medical doctor’; Chuvash xaśat ‘newspaper’, 
kĕneke ‘book’; Kazakh stol ‘table’, kerewet ‘bed’; Uyghur aptomobil ‘car’, kastum 
‘costume, suit’. 

Russian impact on the modern vocabulary of Northern Azeri is strong, whereas 
it is almost absent in Southern Azeri, which has many loans from Persian. Thus 
Northern Azeri stol ‘table’ corresponds to Southern Azeri miz, aftobus ‘bus’ to otobus, 
kartof ‘potato’ to yer almasï, gäzet ‘newspaper’ to ruznamä, universitet ‘university’ 
to danišgah, čaynik ‘tea pot’ to čaydan, etc. Due to the different political and cultural 
development over the last six hundred years, the Northern Azeri and Turkish 
vocabularies differ from each other in many respects. Loans from Russian often 
correspond to Turkish loans from West European languages, e.g. Azeri zavod 
‘factory’ versus Turkish fabrika, galstuk ‘necktie’ versus kravat, lampa ‘lamp’ ver- 
sus lamba. 

In the post-Soviet period there are tendencies to reduce Russian loans in favor 
of native or Arabic-Persian words. But Russian is still the language of commu- 
nication in professional domains, education, and science in many parts of the 
former Soviet Union. The native terminologies that are now being developed are 
thus often rather restricted to textbooks. In informal spoken registers, copies from 
Russian are still often used. 

Loans from other Slavic languages are numerous in dialects of the Balkans and 
some East European varieties. Numerous lexical items of Gagauz and Karaim are 
copied from Slavic. The lexicon of West Rumelian Turkish is heavily influenced 
by contacts with Slavic. Turkish used to be a rich lexical source for the Slavic con- 
tact languages, providing words relating to everyday life as well. This situation 
has changed radically. In Bulgaria, campaigns have been conducted against the 
use of Turkish words in local Turkish dialects and in favor of their replacement 
by Bulgarian words. The change in dominance relations is shown by the fact 
that colloquial West Rumelian Turkish is reborrowing loans in the shape they 
have in the languages of Macedonia and Kosovo that originally borrowed them, 
e.g. piper ‘pepper’ instead of biber, sapun ‘soap’ instead of sabun, pita ‘flat bread’ 
instead of pide, tašliya ‘stony’ instead of taşlı (Friedman 2006: 42). 


5.4 Mongolic loans 


Mongolic loans are found in many Turkic languages: Yakut, Tatar, Bashkir, 
Noghay, Kazakh, Kirghiz, South Siberian Turkic, Uyghur, etc. For an overview, 
see Schönig (2003); on loans in West Oghuz Turkic, see Schönig (2000). Languages 
spoken by nomadic groups have borrowed numerous Mongolic words as a result 
of close contacts especially in the Middle Ages. A number of languages have loans 
of Middle Mongolian origin. The literary language Chaghatay displays loans from 
the domains of warfare and administration. Most Yakut words of foreign origin 
are Mongolic loans. There is an old Buryat layer from the early period of settle- 
ment on the shore of Lake Baikal. Kirghiz possesses numerous old loans, e.g. sonun 
‘remarkable’, dülöy ‘deaf’, belen ‘ready’. Due to close contacts with other nomadic 
groups, Kazakh exhibits many words of Mongolic origin, e.g. olja ‘booty’, kunan 
‘colt in the third year’. Most of them date back to the eighteenth century, when 
Kazakh and western Mongol tribes fought in the steppes. Languages of the Volga 
region exhibit loans from Middle Mongolian, e.g. Tatar uram ‘street’, dala ‘steppe’. 
Chuvash copied its words of Mongolic origin from Tatar after the thirteenth 
century. Yellow Uyghur, spoken in western China, exhibits direct loans from neigh- 
boring Mongolic varieties. 

In South Siberian Turkic, Mongolic cultural loans often play a role comparable 
to that of Arabic-Persian loans in Islamic Turkic languages. There is an older 
Mongolic layer common to all Sayan varieties. Even köz ‘eye’ has been replaced 
by garak throughout the Sayan area. A West Mongolian layer goes back to the 
Oirat rule of the fifteenth to seventeenth centuries. There are also borrowings from 
Buryat, particularly in Tofan. Tuvan possesses loans from Mongolic in all word 
classes. It exhibits strong Khalkha influences dating back to the rule of the Altan 
Khans. Some recent words copied from Khalkha from 1930 to 1945 reflect modern 
spoken forms and are usually neologisms used as synonyms of Russian loans, 
e.g. xuviska:l ‘revolution’ for revolyucya. Dukhan, spoken in northern Mongolia, 
has been subject to strong influence from Khalkha and Darkhad Mongolian 
during the last six decades. Many of its loans have abstract meaning, e.g. Sasin 
‘religion’, janjil ‘habit’, domok ‘legend’, jayan ‘destiny’, amdiral ‘life’ (Ragagnin 2008). 


5.5 Chinese loans 


On Chinese loans in Old Uyghur, see Menges (1968: 168-9). The modern Turkic 
languages spoken in China are strongly influenced by Chinese. Older Chinese 
loans are found in all varieties of modern Uyghur (Jarring 1964), but the lexical 
influence has grown steadily stronger. Most loans have entered through recent 
contacts with western Mandarin dialects of Xinjiang. Many of them denote items 
introduced by Chinese immigrants, e.g. joza ‘table’, manta ‘stuffed bun’. Many 
Chinese neologisms denoting technological, political, bureaucratic, and military 
innovations have been copied. Not all are used in spoken registers, and most are 
lacking in Uyghur varieties outside Xinjiang. In the 1960s, the use of Chinese 
scientific terminology was obligatory. There is now a tendency to replace Chinese 
words by means of Turkic word formation devices and loan translations. 

The spoken varieties of Kazakh and Kirghiz in Xinjiang exhibit a certain 
number of Chinese loans, but the written varieties are relatively little affected. 
Yellow Uyghur and Salar, spoken outside Xinjiang in western China, display 
various loans from neighboring Chinese varieties. 


5.6 Uralic loans 


Turkic languages of the Volga region, particularly Chuvash, possess many elements 
copied from neighboring Volga-Finnic varieties. Mari loans are very common in 
Upper Chuvash, especially the Sundyr dialect, testifying to a Finnic substrate due 
to the assimilation of a local population, e.g. lĕpĕ ‘butterfly’, yantar ‘glass’, pürt 
‘house’. Many of the copied words have now vanished in Mari. Mari also has 
a large number of Turkic loanwords. For a discussion of the initial contacts, see 
Róna-Tas (1988). Tatar and Bashkir words of Finno-Ugric origin mainly occur in 
the dialects. 

A layer of loanwords in Chuvash indicates old contacts with Samoyedic vari- 
eties in southwestern Siberia. The modern South Siberian languages exhibit 
South Samoyedic loans as substrate products. The loans primarily belong to the 
sphere of reindeer-breeding, hunting, fishing, and botany (Helimski 1995). 


5.7 Other loans 


Words of Greek and Armenian origin are found in many Turkic languages of the 
western sphere. Karaim has an old layer of Hebrew loanwords and many copies 
from Lithuanian. Balkan Turkic varieties possess loans from Albanian and 
Rumanian. Yakut and other Siberian languages have loans from Tungusic, often 
belonging to the domains of husbandry and everyday life. Many Yakut elements 
are probably due to contact with Paleoasiatic languages. Words of Yeniseic origin 
are found in South Siberia. Salar and Yellow Uyghur display many Tibetan loans 
from neighboring languages. There are many words of unknown origin, e.g. in 
Chuvash. 

Turkish, the most developed Turkic language, exhibits, in addition to the loans 
already mentioned, numerous lexical elements, e.g. technical, scientific and political 
terms, from Greek, Italian and French, and after World War II also significantly 
more from English (Tietze 1990). For Turkish nautical terms of Italian and Greek 
origin, see Kahane, Kahane, and Tietze (1958). 


5.8 Copied verbs 


Verbal stems are seldom copied as such, e.g. Kirghiz kara- ‘to look’, Tatar tukta- 
‘to stop’, Tuvan ägälä- ‘to start’. A more common way to accommodate copies to 
function as verbs is derivation from nominals, i.e. nouns and adjectives, by means 
of suffixes, e.g. Turkmen harč-la- ‘to spend’ < harč ‘expenditure’. Verbs can also 
be formed analytically by means of auxiliary “light” verbs, e.g. Turkmen harč 
et- ‘to spend’. These compounds are lexicalized verbal phrases which form one 
syntactic constituent. The nominal element does not function as a free object, and 
it normally cannot be separated from the auxiliary verb by other elements than 
particles meaning ‘also’ or ‘even’. 

Light verbs meaning ‘to do’ include ät-, äylä-, kil-, kin-, yap-, etc. The nominal 
is a petrified element, often an Arabic verbal noun, e.g. Azeri niyyet elä- ‘to intend’ 
< niyyet ‘intention’, or a Slavic infinitive, e.g. Karaim zvont’ et-, Yakut svoni gin- 
‘to phone’ < zvonit’, birasti gin- ‘to apologize’ < prostit’, Armeno-Kipchak vikupit 
et- ‘to ransom’ < vykupit’. 

This type is common as early as Old Uyghur, e.g. kSanti kil- ‘to confess’. Other 
examples: Turkish memnun et- ‘to satisfy’, spor yap- ‘to do sport’, Kazakh iimit kil- 
‘to hope’, Middle Kipchak nama:z kil- ‘to pray’, niyyet elii- ‘to intend’, Ottoman 
irsa:l et- ‘to send’, Karachay-Balkar razi et- ‘to satisfy’. 

Ottoman Turkic uses dt-, dyld- and kil- ‘to do’. Modern Turkish normally uses 
et-, but has also introduced yap-. Azeri displays eli-, et-, and gil-. The verb dit- is 
used in the rest of Oghuz Turkic, in Kipchak Turkic, Kipchakoid South Siberian 
varieties, and Salar. It is not used in Sayan Turkic and Yakut. The verb dyla- < 
Géla- is still used in the west, mainly in Southwest Turkic and Northern Kipchak 
Turkic, e.g. Azeri puside eld- ‘to cover’, cf. Persian pusi:dan. The old auxiliary kil- 
is preserved in Southeast Turkic and Sayan Turkic. Oghuz and Kipchak kil- is 
mostly confined to old formations of elevated style, e.g. Turkish namaz kil- ‘to per- 
form ritual worship’. The form kin- is found in the Sayan Turkic language Tofan, 
gin- in North Siberian Yakut. The old verb tu- ‘to do’ is used in Chuvash, e.g. 
astu- ‘to remember’ < as ‘memory’. 

Intransitive verbs are formed with nominals, often participial forms, plus 
bol- ~ pol- ~ ol- ‘to be, to become’, e.g. Ottoman Turkic na:il ol- ‘to obtain’, za:yi 
ol- ‘to be lost’, Sa:d ol- ‘to rejoice’. 

Iranian languages use similar methods to accommodate borrowed Turkic 
verbs, combining a nominal form with a native auxiliary verb such as ‘to do’, e.g. 
Tajik amr kardiin ‘to give an order’. It is also common to use nominal forms in the
now unproductive Turkic marker -mis to copy Turkic verbs, e.g. Tajik tugulmig
karddn ‘to be born’, cf. Uzbek tuyil-. 


5.9 Copied function words 


Turkic languages, both spoken and written, have shown a propensity for copy- 
ing function words, e.g. conjunctions, postpositions, and discourse markers, from 
their contact languages. Languages of the Islamic sphere have copied from 
Persian, particularly all Persian-influenced varieties of Iran; those of the Russian 
sphere of influence have also copied from Russian, e.g. no ‘but’, i ‘and’, ili ‘or’ 
(Johanson 1997): Uzbek goyd ‘as if’; Turkmen we ‘and’, xem ‘also’; Tatar ham ‘and’, 
amma ‘but’, dgiir ‘if’; Turkish ve ‘and’, ciinkii ‘for’, geri ‘though’, eger ‘if’, no ‘but’; 
West Rumelian Turkic a ‘and, but’. Many borrowed junctors are complex, e.g. ta: 
ki ‘until’. Some varieties in Iran have even copied Persian prepositions, e.g. Khalaj 
bi: sin ‘without you’. The syntactic features of Slavic subordinative junctors are 
sometimes selectively copied (as “calques”), e.g. Gagauz ani ‘what, which’. 

Copying of temporal, purposive, causal and other conjunctions is connected with 
a reduced use of Turkic participial constructions. The elements are often integrated 
into Turkic syntax in a way different from their behavior in the model languages. 
The free junctor ki ‘that’ (etc.) has a broad functional scope, serving as a general 
connector preceding relative and complement clauses. 

The Volga-Finnic language Mari has copied postpositions and particles from 
Tatar and Chuvash, e.g. kérd ‘in view of’, ‘because of’, the superlative particle en, 
and the interrogative particle mo. 


5.10 Copied affixes 


Copying of bound morphology is in general known to be relatively unattractive, 
but there are clear instances of non-Turkic languages copying bound morphemes 
from Turkic, including both derivational morphemes and inflectional mor- 
phemes. This may be due to the agglutinative nature, the low level of fusion, of 
Turkic morphology, i.e. the frequent one-on-one correspondence between gram- 
matical categories and their exponents. Suffixes seem to be more or less pervious 
to copying depending on their position in the word. 

Turkic derivational morphemes such as the agentive nominalizer -ci have 
been copied into many contact languages. On the other hand, Balkan Turkic shows 
Slavic borrowings, e.g. markers of feminine nouns and diminutives. Turkic has 
copied comparative markers, e.g. Iranian Turkic -tar, whereas Tajik uses the cor- 
responding Turkic suffix -raq. In spite of the dominant suffixing morphology, the 
strong Iranian influence on Uzbek has led to the emergence of prefixes, e.g. 
na-toyri ‘untrue’ with nd- ‘non-’ copied from Persian + Turkic toyri ‘right, true’. 
Inflectional morphemes, e.g. case and person-number markers, have been copied 
into varieties of Tajik and Anatolian Greek, respectively. The bound morphology 
of Mari has undergone strong Turkic impact, e.g. copying of word-formative and 
case suffixes.


5.11 Adaptation of loans


Loanwords are subject to phonological adaptation (integration, assimilation) of
different kinds in Turkic languages, i.e. the pronunciation is more or less
accommodated to the indigenous sound systems. There are different styles or registers
of pronunciation according to the speakers’ proficiency in the model languages.
Monolinguals have mostly tended to adapt the shape of loans to their native
systems.

Thus, older Russian loans show a high degree of adaptation, e.g. Yakut silaba:r 
‘samovar’, biragra:mma ‘program’; Kazakh kerewet ‘bed’ < Russian krovat’; Chuvash 
apat < Russian obed ‘dinner’. Loans belonging to the domains of modern life are 
less adapted. Items copied before the formation of the modern standard languages 
are often written as they were pronounced at the time of copying. The adaptation 
is sometimes reflected in old Arabic- and Roman-based orthographies. Since the 
introduction of the Cyrillic script, Russian loans are written in their original graphic 
shape, which obscures possible adaptations. Though orthoepic norms recommend 
native Russian pronunciation, the loans may still be pronounced according to 
indigenous rules.

Arabic-Persian loans confront us with corresponding problems. The Arabic script 
often represents them in shapes indistinguishable from those of the model languages. 
Turkic languages differ with respect to the degrees of adaptation. Items copied 
directly from spoken Persian have generally accommodated closer to the Turkic 
phonological system, e.g. Kirghiz ubakti ‘time’ < Arabic waqt, jo:p ‘answer’ < Arabic 
jawa:b (Johanson 1986). 

It is unknown to what extent lexical copies in the Old Uyghur language were 
adapted to Turkic phonology and phonotactics, or stand for marginal sound struc- 
tures in learned pronunciation. 

Certain marginal sounds only occur in loans and are not common to all spoken
styles. Long vowels appear in the pronunciation, at least the conservative
pronunciation, of Arabic-Persian loans. The consonant šč, written щ, mostly
pronounced as a long palatal fricative š:, occurs in more recent Russian loanwords.
The affricate c (ts) occurs in loans from Russian and Chinese. The labiodental
fricatives f and v are marginal in many languages. The glottal h occurs primarily
in words of Arabic-Persian origin.
Initial h- is sometimes dropped, e.g. Noghay ar ‘each’ < har, Kirghiz apta ‘week’
< hafta. The consonant ž occurs primarily in words of Persian and French origin,
e.g. Azeri Ziist ‘gesture’ < geste. 

The glottal stop ʔ in words of Arabic-Persian origin is realized in certain languages,
e.g. Tatar masʔiilii ‘question’, Uyghur tibiʔi ‘natural’. It may even occur in learned
Turkish pronunciation, e.g. saʔat ‘hour’. Noninitial glottal stops also occur in Uyghur
loans from Chinese, e.g. fayʔin ‘scheme’. The pharyngeal ʕ is also realized as a
glottal stop. Both consonants may be lost, sometimes compensated by vowel length,
e.g. Azeri te:sir ‘influence’, Uyghur masila ‘question’. In certain languages, they
are represented by the voiced fricative γ, e.g. Tatar yaddt ‘habit’, sayir ‘poet’.

Loans are more or less adapted to Turkic phonotactic rules. Though certain
consonants, e.g. n and r, do not occur initially in native words, they may be realized
as such in loans, e.g. Yakut na:da ‘necessary’. The initial position can be avoided
by prothetic vowels, e.g. Turkish istasyon ‘station’, Turkmen irayo:n ‘district’, 
Kazakh iras ‘true’ < Persian rast. Nonpermissible consonant clusters are dissolved 
through consonant deletion or insertion of epenthetic or prothetic vowels, e.g. 
Turkmen u660ol ‘table’ < stol; Uyghur pikir ‘thought’ < fikr, cilen ‘member’ < clen’; 
Kazakh xaltk ‘people’ < xalq; Tatar dus < dost ‘friend’; Turkish isim ‘name’ < ism. 
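
The two vowel-insertion repairs just described can be sketched in a few lines of Python. This is a toy illustration only, not a model of any particular language. The function name `adapt`, the fixed insert vowel, the vowel list, and the input `stasyon` (a hypothetical unadapted form behind Turkish istasyon) are all assumptions made for the example. Real adaptation also involves vowel harmony, consonant deletion, and language-specific lists of permitted clusters, none of which is modeled here.

```python
# Letters treated as vowels in this toy; everything else alphabetic is a consonant.
VOWELS = set("aeiouöüıəä")


def is_consonant(ch: str) -> bool:
    return ch.isalpha() and ch.lower() not in VOWELS


def adapt(word: str, vowel: str = "i") -> str:
    """Toy repair of edge clusters: prothesis initially, epenthesis finally."""
    # Word-initial cluster: add a prothetic vowel (st- becomes ist-).
    if len(word) >= 2 and is_consonant(word[0]) and is_consonant(word[1]):
        word = vowel + word
    # Word-final cluster: split it with an epenthetic vowel (-sm becomes -sim).
    if len(word) >= 2 and is_consonant(word[-1]) and is_consonant(word[-2]):
        word = word[:-1] + vowel + word[-1]
    return word


print(adapt("stasyon"))  # istasyon
print(adapt("ism"))      # isim
```

The fixed vowel `i` happens to give the attested Turkish outputs for these two inputs. A fuller sketch would choose the inserted vowel from the neighboring vowels.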


5.12 Phonological (material) copies 


The introduction of less adapted loanwords has often affected the phonological 
systems of Turkic languages. Two examples: Gagauz and Karaim have acquired
palatalized front consonants due to Slavic influence. Chuvash displays palatal- 
ized consonants before and after front vowels. In Istanbul and Balkan dialects, k 
and g are palatalized before front vowels, e.g. gil ‘come’, k‘itap ‘book’. Urban Uzbek, 
Iranian Turkic, West Rumelian Turkish, etc. possess vowels influenced by
Iranian or Slavic, retracted variants of ö and ü, and absence of ï. The typical Turkic
sound harmony structures have been disturbed in many dialects under Persian 
influence. 


5.13 Copies of combinational properties


Copying of combinational properties has played an important role in Turkic
contact situations. It has affected word structures and syntactic constructions,
e.g. patterns of word order and clause combining. Copying of combinational and
semantic properties has led to restructuring of morphosyntactic subsystems, e.g. 
aspect-mood-tense and case systems (Boeschoten & Johanson 2006). 

The contacts have sometimes been sufficiently intensive and long-lasting to pro- 
vide Turkic varieties with grammatical components strongly modeled on non-Turkic 
patterns. Examples include Slavic influence on Karaim, Persian influence on 
Kashkay, etc. Vice versa, non-Turkic languages have developed grammatical com- 
ponents patterned after Turkic models. One case in point is the Turkish influence 
on Greek dialects spoken in Central Anatolia. 

Copying of combinational and semantic properties will not be dealt with at
length here. For details, see Johanson (2002; 2008). However, the following are
some examples. 

Word order has been an important domain of contact influence. Shifts in sen- 
tential word order have, however, mostly affected the use of existing structures 
rather than leading to acquisition of new structures. Pragmatically marked word 
orders have often been treated as unmarked. In Karaim and Gagauz, strong 
foreign influence has led to the weakening of the basic verb-final word order in 
favor of the verb-object order. Whereas Turkic generally places relative clauses 
before their heads, Irano-Turkic varieties prefer postposed relative clauses of 
the Persian type. Due to Persian influence, some Turkic varieties spoken in Iran 
display prepositions. Karaim has developed prepositions under Slavic influence. 
Karaim and some Balkan varieties display cases of genitive-head reversal, e.g.
West Rumelian Turkish baba-si Ali’nin ‘the father of Ali’.

Morphosyntactic patterns have proven highly susceptible to contact-induced
changes. Many Turkic languages have been strongly affected in the domain of 
complex sentences. The weakening or loss of nonfinite constructions, e.g. Turkish
gel-dig-in-i duy-du-m <come-VERBAL.NOUN-POSSESSIVE.3SG-ACCUSATIVE
hear-PAST-1SG> ‘I heard that (s)he has come’, is a typical tendency of Turkic in
contact with Indo-European, e.g. Karaim tuy-du-m k’i k’el’-d’i <hear-PAST-1SG
JUNCTOR come-PAST-3SG>. Conjunction-marked postposed clauses with finite verb
morphology have been introduced. In West Rumelian Turkish, this expanded usage
is calqued from subordinate clauses in Macedonian and Albanian. Discourse
motivations for calquing of relativizing and coordinating conjunctions are discussed
in Matras (2006). Some non-Turkic varieties, e.g. Tajik Persian, have acquired
a head-final clause subordination system with patterns copied from Turkic 
syntax. 

Turkic emulations of foreign hypotaxis are often subject to constraints, the 
constructions being integrated in a way different from those in the model lan- 
guages. The junctors (relators, relativizers) used, e.g. ki ‘that’, do not always mark 
embedded clauses in the way Turkic subordinative devices do. The question of 
whether or not the calqued constructions represent genuine hypotaxis should be 
investigated further; see Johanson (1975). 


6 Conclusions 


Turkic languages present particularly rich sources of data for the study of language 
contact. The continuous massive displacements of Turkic-speaking groups through- 
out their history have led to numerous encounters, to various contact-induced 
processes of change and shift, and to new linguistic configurations under differ- 
ent sociocultural conditions. 

Though speakers of Turkic often entered new areas in the course of political 
expansion, this was far from always tantamount to linguistic expansion. There 
were major, massive migratory movements and minor movements of thin ruling 
layers. In some cases, local speaker groups shifted to the incoming codes; in other 
cases, incoming groups shifted to the local codes. 

Varying and intensive intrafamily contacts between highly mobile Turkic- 
speaking groups led to reciprocal influence, to splits between closely related 
varieties, and to new constellations, in particular in the old heterogeneous tribal 
confederations. 

Turkic entertained close interfamily contacts with an astonishing number of 
genealogically and typologically different languages: Indo-European such as Iranian 
and Slavic, non-Indo-European such as Mongolic, Finno-Ugric and Tungusic. Major 
contact areas include Central Asia, Siberia, the Volga-Kama region, the Caucasus,
Transcaucasia, Iran, Anatolia, the Balkans, and, recently, northwestern Europe. 

Certain contacts have been sufficiently intensive and long-lasting to produce 
thorough linguistic changes. The grammatical components of some Turkic languages 
are almost totally modeled on non-Turkic patterns; those of some non-Turkic
languages are similarly patterned after Turkic.

Factors determining linguistic copying have in part been social, “prestige” factors,
and in part structural, “attractive” features. Under appropriate social circumstances, 
almost any features seem to have been copied, even those that appear to be typo- 
logically inconsistent with the rest of the structure of the copying language. 

Various types of copied features have been mentioned above. Turkic has copied
parts of its lexicon, even function words, from Arabic-Persian, Slavic, Mongolic, 
Chinese, Uralic, etc. Loanwords have been subject to phonological adaptation of 
different kinds. The introduction of less adapted loanwords has often affected the 
phonological systems. Copies of combinational properties have affected word struc- 
tures and syntactic constructions, e.g. patterns of word order and clause combining. 
Copying of combinational and semantic properties has resulted in restructuring 
of morphosyntactic subsystems, e.g. aspect-mood-tense and case systems. 

Turkic language contacts have long been subject to detailed investigation, most 
intensively in the last decades. A good deal of research still remains to be done, 
in particular since the amplitude and the richness of the contact-induced phenomena 
observed here appear to be of paradigmatic value for the study of language
contact in general.


33 Contact and North
American Languages


MARIANNE MITHUN 


Languages indigenous to the Americas offer some good opportunities for inves- 
tigating effects of contact in shaping grammar. Well over 2000 languages are 
known to have been spoken at the time of first contacts with Europeans. They 
are not a monolithic group: they fall into nearly 200 distinct genetic units. Yet against 
this backdrop of genetic diversity, waves of typological similarities suggest 
pervasive, longstanding multilingualism. Of particular interest are similarities of 
a type that might seem unborrowable, patterns of abstract structure without shared 
substance. 

The Americas do show the kinds of contact effects common elsewhere in the 
world. There are some strong linguistic areas, on the Northwest Coast, in California, 
in the Southeast, and in the Pueblo Southwest of North America; in Mesoamerica; 
and in Amazonia in South America (Bright 1973; Sherzer 1973; Haas 1976; 
Campbell, Kaufman, & Stark 1986; Thompson & Kinkade 1990; Silverstein 
1996; Campbell 1997; Mithun 1999; Beck 2000; Aikhenvald 2002; Jany 2007). 
Numerous additional linguistic areas and subareas of varying sizes and strengths 
have also been identified. In some cases all domains of language have been affected 
by contact. In some, effects are primarily lexical. But in many, there is surprisingly 
little shared vocabulary in contrast with pervasive structural parallelism. The focus 
here will be on some especially deeply entrenched structures. 

It has often been noted that morphological structure is highly resistant to the 
influence of contact. Morphological similarities have even been proposed as 
better indicators of deep genetic relationship than the traditional comparative 
method. In his attempts to group North American families into larger superstocks, 
Sapir was adamant that morphology outlasts cognates: “so long as such direct 
historical testimony as we have gives us no really convincing examples of pro- 
found morphological influence by diffusion, we shall do well not to put too much 
reliance in diffusion theories” (1921: 206). The principle seems reasonable. The 
internal structure of words is generally less accessible to the consciousness of 
speakers, and, one would expect, less easily manipulated by bilingual speakers 
seeking to bring structures from one of their languages into the other.

Yet numerous morphological parallelisms appear in neighboring but genetically
unrelated American languages. It might seem difficult to imagine how such struc- 
tures could be transferred under contact: they involve abstract, largely unconscious 
patterns without the words or morphemes that carry them. Here some mechanisms 
will be suggested that might result in such transfers. 


1 Detecting Contact without Philology 


For most languages of the Americas, there are no written records comparable 
to those for major languages of Europe. Many communities did not encounter 
Europeans until the late eighteenth or nineteenth century. It is thus not generally 
possible to trace the effects of contact philologically, particularly grammatical 
patterns that develop gradually. Alternative strategies must often be explored. 

The clearest evidence of contact is of course loanwords. Many languages of the 
Americas show the same kinds of lexical loans as languages elsewhere. The word 
hayu ‘dog’, for example, appears in neighboring but genetically unrelated languages 
of Northern California: in the Pomoan languages; in Bodega Miwok, Lake Miwok, 
and Southern Sierra Miwok but not Central Miwok or Northern Sierra Miwok; 
in Hill Patwin but not its sister Wintu; in Maidu but not its sister Nisenan; in the 
Western dialect of Wappo but not the Southern (Napa) dialect. Many American 
languages contain loans from the European languages of colonists: French in the 
Northeast, French and Spanish in the Southeast, Spanish in the Southwest and 
California, and Russian in Alaska (Mithun 1999: 311-13). 

Among the shared words are items once thought to be unborrowable. Pronouns, 
particularly full paradigms, have sometimes been cited as indicators of deep genetic 
relationship. Yet Yuki and Wappo, two California languages, borrowed first and 
second person pronouns from the neighboring but unrelated Pomoan languages 
(Mithun 2008). There are even cases of borrowed bound pronouns. Alsea, a 
language of the Oregon Coast, contains subject enclitics attached to the first 
element of the clause. The full set of enclitics shows a perfect match with that 
reconstructed for Proto-Salishan, immediately to the north (Kinkade 1978). Of course 
the contact indicated by loanwords need not have been direct. Spanish loanwords 
in many California languages were not adopted directly from Spanish speakers 
but rather through the intermediary of other California languages. Importantly, 
the absence of loanwords does not necessarily indicate an absence of contact. 
Multilinguals sometimes take special pains to keep their languages distinct, often 
with a focus on vocabulary. 

Establishing contact as the source of structural similarities can be more 
challenging, particularly when no substance is involved. Chance can be a greater 
factor in structural parallelism than in shared vocabulary: many languages show 
basic verb-final clause structure, for example, simply because the alternatives are 
so limited. An important strategy for detecting contact-induced grammatical 
change, particularly morphological change, is the comparison of structures in
genetically related languages spoken in different geographical areas. Features shared
by a language with its neighbors but not with its relatives outside of the area are
more likely to be a result of contact.

This situation can be illustrated with consonant inventories in the Algic family. 
The family consists of the Yurok and Wiyot languages on the northern California 
Coast, and the Algonquian group of nearly 30 languages distributed across 
the continent from Alberta, Montana, and Wyoming to the Atlantic Coast. The 
inventories of the two California languages differ strikingly from those of their 
Algonquian relatives. 


(1) Algic consonant inventories 
a. Yurok: 27 (Blevins 2003)
b. Wiyot: 25 (Teeter & Nichols 1993)
c. Proto-Algonquian: 13 (Bloomfield 1946):
p, t, č, k, s, š, h, m, n, θ, l, w, y


The Yurok and Wiyot inventories resemble those of their Northern California neigh- 
bors, Chimariko and the Pacific Athabaskan languages (Hupa, Tolowa, Mattole, 
and Eel River dialects). The Chimariko inventory contains 33 distinctive
consonants with plain, ejective, and aspirated obstruents, and front and back apicals
(Jany 2007). The Pacific Athabaskan languages contain in addition a voiceless
lateral and labio-velars (Golla 1996). There is clear consensus
that there are no genetic links among the Algic languages and their
Chimariko and Athabaskan neighbors. Northern California is known as an area 
of longstanding multilingualism. Communities have always been small and 
intermarriage has been the norm. The consonant inventories reflect this history. 
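
As a rough numerical illustration of this comparison, the sketch below uses only the inventory sizes reported above (Yurok 27, Wiyot 25, Proto-Algonquian 13, Chimariko 33). The function name `closer_to` is an assumption introduced for the example. Inventory size is only a crude proxy: the argument in the text also rests on the kinds of consonants involved, which this sketch does not model.

```python
# Inventory sizes as reported in the text above.
SIZES = {"Yurok": 27, "Wiyot": 25, "Proto-Algonquian": 13, "Chimariko": 33}


def closer_to(lang, neighbor, relative, sizes=SIZES):
    """Say whether lang's inventory size is nearer the neighbor's or the relative's."""
    to_neighbor = abs(sizes[lang] - sizes[neighbor])
    to_relative = abs(sizes[lang] - sizes[relative])
    if to_neighbor == to_relative:
        return "tie"
    return "neighbor" if to_neighbor < to_relative else "relative"


for lang in ("Yurok", "Wiyot"):
    # Compare each California Algic language with an unrelated neighbor
    # (Chimariko) and with its reconstructed relative (Proto-Algonquian).
    print(lang, closer_to(lang, "Chimariko", "Proto-Algonquian"))
```

On these figures both Yurok and Wiyot come out nearer to Chimariko than to Proto-Algonquian, in line with the areal pattern described.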

A number of fundamental grammatical structures show similar distributions, 
shared among neighbors but not among related languages outside of the linguistic 
areas. 


2 Patterns of Core Argument Structure 


On the basis of a survey of 174 genetically and areally diverse languages, Nichols 
(1992) proposes that core argument patterns, such as nominative/accusative, 
ergative/absolutive, etc. have “high genetic stability” and are potentially capable 
of revealing genetic relations more ancient than those recoverable through the 
comparative method: “Dominant alignment is genetically stable and not greatly 
susceptible to areal spread” (1992: 166). The proposal makes sense. Grammatical 
relations are typically coded by morphology, one of the most tightly integrated, 
systematic domains of grammatical structure, less accessible to the consciousness 
of speakers than independent words. Yet clusters of the core argument patterns 
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identified by Nichols as the rarest cross-linguistically appear in several geo- 
graphical areas of North America, often cutting across genetic lines. 


2.1 Semantically based systems 


The Athabaskan-Eyak-Tlingit languages are distributed over a large area from the 
Southwest through Alaska. All of the nearly 40 Athabaskan languages identify 
core arguments by pronominal prefixes in their verbs. Subject prefixes occur at 
the center of the verb immediately adjacent to the classifier + stem complex. (Basic 
third person subjects are zero.) Object prefixes occur further from the stem, 
potentially separated from it by various modal, aspectual, and adverbial prefixes. 


(2) Navajo pronominal prefixes (Faltz 1998: 112-13, 156): 

a. ha-n-sh-tteeh 
up.out-2SG.OBJECT-1SG.SUBJECT-CL.handle.animate.object 
‘I’m carrying you up.’ 

b.  ha-sh-ni-tteeh 
up.out-1SG.OBJECT-2SG.SUBJECT-CL.handle.animate.object 
“You’re carrying me up.’ 

c. ha-ni-tteeh 
up.out-2SG.OBJECT-CL.handle.animate.object 
‘(He/she) is carrying you up.’

d. ha-ni-d-eesh-téét
up.out-2SG.OBJECT-FUTURE-1SG.SUBJECT-CL.handle.animate.object.FUTURE
‘I’ll carry you up.’


(3) Navajo subject sh- [s-] ‘I’ (Young, Morgan, & Midgette 1992):
yi-sh-hadd ‘I shook it.’ (a rattle) 230
‘adah ‘ii-sh-aah ‘I went down, descended.’ 664
ni-sh-chon ‘I stink.’ 82
‘adadii-sh-nih ‘I got hurt.’ 456

The Athabaskan languages are related as a group to the Eyak language of Alaska. 
Eyak subject pronominal prefixes, which are cognate with those in Athabaskan 
languages, also occur immediately before the classifier + stem complex. 


(4) Eyak pronominal subject x- ‘I’ (Krauss 1982):


ich’ ganuh qu’-x-tah ‘I will show you.’ 42
qe’ qu’-x-dage: ‘I am going to boat back.’ 75
Datli: a’q’ sixuhtktinu: ‘I already shoveled them out.’ 75
gala-x-tah ‘I am alive.’ 75
qu’-x-sinh ‘I will die.’ 119


Objects are represented by pronominal clitics preceding the full verb.


(5) Eyak object clitic sik’ah ‘me’ (Krauss 1982):
sik’ah q’e’ sdile’kt ‘He released me.’ 43
de:dal sixa’ k’usatyaht? ‘What’s this interfering with me?’ 99


The Athabaskan-Eyak group is related in turn to the Tlingit language of Alaska. 
Tlingit also contains a set of pronominal prefixes in the verb immediately before 
the classifier + stem complex, cognate with the subject prefixes in Athabaskan and 
Eyak. Again basic third persons are zero. 


(6) Tlingit pronominal prefix x- ‘I’ (Story & Naish 1973):


kaxashxéet ‘I’m writing.’ 374
xx’ kaxshaxéet ‘I’m writing a letter.’ 374
xwaajak ‘I killed it.’ 377
ktinax ooxdzikaa ‘I’m really lazy.’ 122
yan sh kaxwyjix’akw ‘I’m sitting very comfortably, just the way I want to.’ 53


A set of pronominal clitics precede the verb. 


(7) Tlingit pronominal clitic xat ‘me’ (Story & Naish 1973): 


xat woositéen ‘He saw me.’ 384 
xat woodoowagwal ‘Somebody hit me.’ 366 
xat yawsitak ‘He poked me in the face.’ 155
xat woodoodzikéi ‘They paid me.’ 146 
tléil agé xat yayeeteen? ‘Don’t you recognize me?’ 169 


The Tlingit pronominals differ in a fundamental way from those in the 
Athabaskan languages and Eyak, however. While the Athabaskan-Eyak
pronominals show a clear nominative/accusative pattern, those in Tlingit show an
agent/patient pattern. The Tlingit prefixes, like x- in (6), represent participants 
who typically instigate and are in control of situations: grammatical agents. The 
clitics, like xat in (7), represent those who are not in control but are significantly 
affected: grammatical patients. Some patients, like those in (7), would be categor- 
ized as direct objects in English or Athabaskan languages. Others, like those in 
(8), would be categorized as subjects. 


(8) Tlingit clitic xat ‘I’ (Story & Naish 1973): 


xat seiwa.at’ ‘I’m cold.’ 366
xat woolitéesh ‘I’m lonesome.’ 127
xat googanda ‘I’m going to die.’ 366
xat woodi.éik ‘I was paralyzed, so shocked I couldn’t act.’ 145
ktinax xat yanéekw ‘I’m real sick.’ 190
xat kawdikei ‘I failed completely.’ 85
yaa xat nadashan ‘I’m growing old.’ 141
yées téel éetee-nax xat ya tee ‘I need new shoes.’ 139
kut xat woodzigéet ‘I was lost.’ 129
xat k’eiwawdash ‘I yawned.’ 251
xat oowakich ‘I sobbed.’ 201
xat yadut’kw ‘I frequently have hiccups.’ 108
xat woodzigit ‘I woke up.’ 241
kindayigin xat woodzigéet ‘I fell flat on my back.’ 247


Though a semantic basis can be seen to underlie the Tlingit system, speakers do 
not make online decisions about degrees of agency, control, or affectedness as they 
speak. The pronominal set associated with each verb is lexicalized. 

One of the patterns represents an innovation. Since the Athabaskan-Eyak lan- 
guages are related as a group to Tlingit, the innovation could have occurred 
in either branch of the family. (Recent work by Vajda (2008a) indicates that
Athabaskan-Eyak-Tlingit is related to the Yeniseic languages of Siberia, but as
reconstructed by Vajda, their common ancestor had not yet developed a full
system of either type (2008b), so the Yeniseic languages provide no help here.)
Suggestive evidence of the direction of shift can be found in a neighbor. 

Immediately to the south of the Tlingit are the Haida, who speak an unrelated 
language. Modern Haida territory was occupied until around 1700 by the Tlingit 
(De Laguna 1990: 203). De Laguna reports that there was intense Tlingit-Haida 
contact and intermarriage, and that “the Tlingit are known to have absorbed 
increments of Haidas and Tsimshians” (1990: 213). The two languages are quite 
different typologically. Haida pronouns are independent words or clitics rather 
than prefixes, and they show no similarity in form to those of Tlingit. They do, 
however, follow an agent/patient pattern. 


(9) Haida Agent/Patient system: 1SG Agent hl and Patient dii (Enrico 2003): 


hl sral-gan ‘I fixed it.’ 491
Joe hl qing-gan ‘I saw Joe.’ 51
‘laa hl st’ida-gan ‘I warned him.’ 433
hl ‘itj-angqasaa-ang ‘I am going to go.’ 565
dii ‘la gu’laa-gang ‘He likes me.’ 79
dii-gingaan ‘la qeenggaa ‘He looks like me.’ 84
dii hlrwaaga-ang ‘I am afraid.’ 87
dii rahgal-gang ‘I am tired of it.’ 82
dii gudang-gang ‘I want to.’ 71
dii q’ud-ang-gan ‘I wasn’t hungry.’ 41
‘laa-gingaan dii qeenggaa ‘I look like him.’ 84


The evidence strongly suggests that the Tlingit system developed under Haida 
influence. 

There is no philological record of the transfer, but a likely scenario can be 
imagined. It is not uncommon cross-linguistically for third persons not to be
mentioned overtly in every clause, so long as reference is clear. Such a propensity can
even be borrowed (Myers-Scotton 2002: 210). The absence of overt third person
reference can set the stage for the reanalysis of nominative/accusative systems 
as agent/patient systems and vice versa. Transitive clauses with a single overt 
object argument could be reinterpreted as intransitive clauses with a single patient 
argument, or the reverse. 


(10) (SUBJECT) (TR) VERB OBJECT ↔ PATIENT (INTR) VERB
(It/something) scared me. ↔ I was/am scared.


Such a development could happen spontaneously in a language. It could also 
be stimulated by contact, as bilinguals strive to reconcile their two grammatical 
systems. 

The Tlingit-Haida parallel is not an isolated case. Clusters of agent/patient 
systems appear in several other areas in North America. The Wappo and Yuki 
languages of California mentioned earlier are distantly related to each other, 
but no further relationships have been identified. The first and second person 
singular pronouns in the two are nearly identical in form (borrowed from 
Pomoan). Third person pronouns, used only for emphasis, developed recently 
in each language from demonstratives. The Wappo pronouns show a nominative/accusative 
pattern. The Yuki pronouns show an agent/patient pattern, one which 
matches that of their Pomoan neighbors down to the finest detail (Mithun 1991; 
2008). 

Agent/patient systems also appear in the Southeast, Great Plains, and Northeast, 
in all languages of the Siouan-Catawba, Caddoan, and Iroquoian families, as well 
as in all languages of the Muskogean family and isolates Chitimacha, Tunica, 
Natchez, and Atakapa. Together these languages cover a wide area from Canada 
to the Gulf of Mexico, and from the Atlantic across the Great Plains. They also 
appear in the Pueblo Southwest, in languages of the Kiowa-Tanoan family as well 
as in dialects of the Keresan language. In some languages the pronominals are 
prefixes, in some suffixes, and in some both. The affixes also show no similarities 
in form across family boundaries. 

Nichols found agent/patient patterns rare cross-linguistically, occurring in just 
13.5 percent of her sample (1992: 101). This rarity, combined with the pervasive- 
ness of the agent/patient systems in North America, suggests contact effects. The 
most likely mechanism of transfer is not unusual: a reanalysis of clause structure 
by bilinguals seeking to reconcile the argument categories of their two languages. 


2.2 Hierarchical systems 


The rarest type of pattern found by Nichols is that termed hierarchical. She noted 
this pattern in just 5 percent of the languages she examined. 

The Wakashan languages are indigenous to the Northwest Coast of North 
America. In them, core arguments are identified by pronominal enclitics to the 
predicate, which is basically clause-initial. 

(11) Ahousaht Nuuchahnulth clitic =s (Nakayama 2003; George Louie, speaker): 

walsirar=s ‘I went home’ 195 
wikraqa=s suutit wiighap ‘I will not harm you’ 383 
nafaa=s ‘I understood’ 169 
fuuyimtck*i=s ‘I was born’ 163 
Tiuhsiz=s ‘I cried’ 166 
lic imfrar=s ‘I am old’ 451 
watsaap'at=s ‘They sent me home’ 167 
n’aacsaat=s q“ayac’tik?i ‘The wolf was watching me’ 383 


Clitic choice is not affected by transitivity, so this is not an ergative/absolutive 
system: the clitic =s ‘I’ appears in both ‘I went home’ and ‘I will not harm you’. 
It is not an agent/patient system: the same clitic appears in ‘I was born’. It is not 
active/stative: the same clitic appears in ‘I am old’. But it is not nominative/ 
accusative either: the same clitic represents both subjects (‘I will not harm you’) 
and objects (‘They sent me home’). 

It is a hierarchical system. Only one argument is represented pronominally in 
a verb. The choice of argument depends on person, according to the hierarchy 1, 
2 > 3. If a first or second person acts on a third (1/3, 2/3), that first or second 
person is represented. If a third person acts on a first or second (3/1, 3/2), again 
the first or second person has priority (‘The wolf saw me’). 

One might wonder how speakers could distinguish ‘I found him’ from ‘He 
found me’. Nuuchahnulth has a suffix -’at, somewhat comparable to a passive 
in other languages. Agents may or may not be mentioned lexically in -’at 
clauses. 


(12) Nuuchahnulth -’at (Nakayama 1997: 168, 170): 

a. ha:fanfanits 
ha:han-‘at-it=s 
invite-PASSIVE-PAST=1SG 
‘I was invited.’ 

b. XK icifatta = mamatn’i 
xi-Ci-fat-Aa: mamatin’i 
shoot-MOMENTANEOUS-PASSIVE-also whiteman 
‘He was shot at by white men again.’ 


The -’at construction is used extensively for ensuring that a continuing discourse 
topic is the core argument of the clause. It also functions to maintain the hier- 
archy. If a first or second person acts on a third (‘I found her’) the -’at construction 
cannot be used. If a third person acts on a first or second, the -’at construction 
must be used (‘I was found’). 

When a clause involves only first and second persons, just the agent is repre- 
sented by a clitic. The other participant may be identified in a separate word. 

(13) Local relations (1/2, 2/1) (Nakayama 2003: 383): 
wikragas suutit wiigtap 
wik-faqa=s sut-cit wi:q-iap 
not-FUTURE-1SG you-doing.to unpleasant-do 
‘I will not harm you.’ 


There is only one context in which two arguments are identified by clitics. 
Special transitive pronominal enclitics representing combinations of first and 
second persons are used in imperatives. 
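
The selection logic described above can be summarized procedurally. The following Python sketch is only an illustrative model built from the prose description in this section; the function name `nuuchahnulth_marking` and the status labels are hypothetical and are not taken from Nakayama's analysis. It covers indicative clauses only and leaves out the special transitive enclitics used in imperatives.

```python
from typing import Optional, Tuple


def nuuchahnulth_marking(agent: int, patient: Optional[int] = None) -> Tuple[int, str]:
    """Return (person cross-referenced by the clitic, status of the -'at construction).

    Persons are 1, 2, or 3; patient=None stands for an intransitive clause.
    Status labels (hypothetical): "n/a", "excluded", "required", "unspecified".
    """
    valid = (1, 2, 3)
    if agent not in valid or (patient is not None and patient not in valid):
        raise ValueError("persons must be 1, 2, or 3")

    if patient is None:
        # Intransitive clause: the single argument is cross-referenced.
        return agent, "n/a"
    if agent in (1, 2) and patient == 3:
        # 1/3, 2/3: the speech-act participant is marked; -'at cannot be used.
        return agent, "excluded"
    if agent == 3 and patient in (1, 2):
        # 3/1, 3/2: the speech-act participant still has priority; -'at must be used.
        return patient, "required"
    if agent in (1, 2) and patient in (1, 2):
        # Local relations (1/2, 2/1): only the agent is represented by a clitic.
        # The section states no hierarchy-based rule for -'at here.
        return agent, "unspecified"
    # 3/3: the hierarchy does not decide; -'at use depends on the discourse topic.
    return 3, "unspecified"
```

Under these assumptions, `nuuchahnulth_marking(3, 1)` models ‘The wolf was watching me’: the first person is cross-referenced and the -’at construction is required.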

The hierarchical system cannot be reconstructed for Proto-Wakashan. To the 
south of Nuuchahnulth are the two other South Wakashan languages: Nitinaht 
(Ditidaht) and Makah. Both also show the 1, 2 > 3 hierarchy. Both maintain it by 
means of constructions cognate with the Nuuchahnulth -’at construction. The hier- 
archical system has not penetrated their grammars quite as thoroughly, however. 
Any time first and second persons act on each other, transitive clitics are used. 

The hierarchical system has been extended even less deeply in the three North 
Wakashan languages. Immediately to the north of Nuuchahnulth is Kwak’wala. 
In this language subjects are identified by enclitics, and objects by verbal suffixes. 
There is one gap in the pronominal paradigm: there are no first person object 
forms. In place of an inherited object form, a word based on the verb ‘come’ is 
used, or an oblique construction. North of Kwak’wala are Heiltsuk and Haisla. 
These two languages show no trace of a hierarchy. Full sets of subject and object 
pronominals exist and are used in all combinations. 

To the south of the Wakashan family is the Chimakuan family, consisting 
of Chemakum and Quileute. Documentation of Chemakum is sparse, but the 
Quileute system is clear. Arguments are identified by pronominal subject encli- 
tics and object suffixes, but not all subject/object combinations occur. There is a 
hierarchy: 2 > 3, also maintained through passivization, but the forms of the Quileute 
pronominals and passive suffixes are completely different from those of the 
Wakashan languages. 

East of the Wakashan and Chimakuan languages are the 23 Salishan languages. 
The northernmost Salishan languages Bella Coola, Comox, and Sechelt show no 
restrictions whatsoever on argument combinations. Immediately to the south along 
the coast, Squamish, Halkomelem, and the Saanich dialect of Northern Straits show 
a limited hierarchical system: 2 > 3. South of them, the Sooke and Lummi dialects 
of Northern Straits, and the Klallam language, show a more extensive hierarch- 
ical system, equivalent to those of their South Wakashan neighbors to the west, 
Nitinaht and Makah: 1, 2 > 3. None of the Salishan languages further south 
(Lushootseed, Twana, Quinault, Lower Chehalis, Upper Chehalis), nor those of 
the Interior, show hierarchies at all. 

The differences in the extent to which the hierarchical systems have penetrated 
the grammars of the different languages and dialects cut across genetic lines. The 
patterns show clear areal grouping, however, with the most extensive system, that 
of Nuuchahnulth, at the geographical core. But how could such abstract struc- 
tures be transferred without the morphemes that carry them? 

The systems need not have been transferred in their modern forms. It is more 
likely that what was transferred was a precursor to the systems: a recurring 
stylistic choice. In languages with a grammatical subject category, certain kinds 
of participants tend to be preferred over others for this role. Animates tend to be 
preferred over inanimates, humans over nonhumans, first and second persons over 
third, agents over patients, given referents over new, and identifiable (definite) 
over indefinite. Such characteristics do not always coincide in a single participant: 
the speaker (first person) may not be a semantic agent, for example. Speakers of 
one of the Northwest Coast languages, perhaps Nuuchahnulth, may have tended 
to prioritize person under such circumstances, often passivizing clauses with third 
person agents acting on first or second person patients. This stylistic tendency 
could easily be transferred by bilinguals from one language to another. The struc- 
tural equivalences already existed in all of the languages: first, second, and third 
person pronominals, and passive constructions. What would have been transferred 
was the frequency of the structures. Recurring choices could become routinized 
and ultimately obligatory. (The systems are further described in Mithun 2007b.) 

Hierarchical patterns are found elsewhere in North America as well. An 
intriguing cluster is in northern California. There the mechanisms used to main- 
tain the hierarchies vary, drawn from various resources originally present in dif- 
ferent languages, but the resulting systems have begun to converge. 

Chimariko, an isolate, shows a strong hierarchical system. Verbs contain 
pronominal affixes with an agent/patient base. Most verbs appear with prefixes, 
but one set appear with suffixes. First, second, and third persons are distinguished, 
and singular and plural number. In addition, different pronouns distinguish first 
person singular agents and patients, and also first person plural agents and 
patients. Only one argument is represented within any verb. In transitive verbs, 
the choice depends on a 1, 2 > 3 hierarchy: speech-act participants have priority 
over others. Verbs with meanings like ‘I found him’ and ‘He found me’ both 
contain only a first person pronoun, but the difference between the two is clear 
from the form of the first person prefix. ‘I found him’ contains just a first person 
agent prefix; ‘He found me’ contains just a first person patient prefix. When both 
arguments are speech-act participants (‘I found you’, ‘You found me’), only the 
agent is represented in the verb. A second argument may be identified with an 
independent emphatic pronoun. Special transitive forms are used in imperatives. 

To the southeast of Chimariko is Yana. In Yana, core arguments are identified 
by pronominal suffixes on verbs. This system shows a nominative/accusative basis. 
The same pronominal forms are used to represent subjects of intransitives and 
transitives, semantic agents and patients (‘I pound it up’, ‘I am shaking with fear’), 
and those involved in events and states (‘I killed him’, ‘I am ugly’). The transi- 
tive pronominal suffixes are now fused complexes, but earlier internal structures 
can be detected. As in many pronominal affix paradigms, there is no overt 
marker for third persons. When a first or second person acts on a third (1/3, 2/3), 
the form is the same as for intransitives (1, 2). The third person object is simply 
not mentioned. When a third person acts on a first or second (3/1, 3/2), an 
additional element -wa- appears, and the stem shows ablaut. The source of this 
-wa- element is a passive marker. Passive formation involves the suffix -wa(?a) 
plus ablaut. Pronominal suffixes representing combinations of first and second 
persons are fossilized, but they also contain recognizable elements. All include 
a remnant of the passive suffix -wa(?a). Various other markers have been added 
over time, apparently to clarify reference in potentially ambiguous situations. An 
element -ki- in forms involving first person plurals comes from a verbal suffix 
‘hither’. An element -wii- in combinations with second person plural subjects matches 
a noun plural. An element -m- in ‘I/you.all’ and ‘we/you.all’ is a second person 
pronominal apparently reinforcing the second person. 

There is a third hierarchical system in the area. To the west of Chimariko, on 
the Coast, is Yurok, an Algic language clearly unrelated genetically to either 
Chimariko or Yana. In Yurok, core arguments in indicative verbs are identified 
by pronominal suffixes. The suffixes generally show nominative/accusative patterning: 
transitivity, semantic role, and aspect make no difference. In transitive 
constructions, however, both arguments are not always represented overtly. In 
certain combinations involving third person patients, the third persons are not 
represented at all. In certain other combinations, there is obligatory passivization 
by means of the passive suffix -ey or -oy: nekcenoy ‘he/she meets us’ is literally 
‘we are met’ (meet-PASSIVE). Yurok thus shows some of the strategies at work in 
neighboring languages to ensure a person hierarchy, but they have not been 
extended through the full grammar. Both participants are still represented in the 
combinations 1SG/2SG, 1SG/3SG, 1SG/2PL, 1SG/3PL, 2SG/1SG, 2/3SG, 3SG/1SG, 
1PL/2SG, 1PL/3SG, 2PL/1SG, and 3PL/1SG. There is a slight priority given to 
second persons: third person agents are never expressed in the presence of 
second persons. 

There is also a fourth hierarchical system in the area. The isolate Karuk is 
spoken to the north of Chimariko and immediately to the east of Yurok. Here 
arguments are identified by pronominal prefixes on verbs. The system has a 
nominative/accusative base. First person subject and object prefixes have different forms: 
ni-mmah ‘I see him’, ná-mmah ‘He sees me.’ But here, too, only one argument is 
expressed in a verb. As in the other languages, first and second persons are always 
chosen over third. Third persons are simply unmentioned. The difference in form 
between first person subjects and objects keeps roles clear for first persons. When 
a second person pronominal prefix represents an object, an inverse suffix -ap is 
added to the verb (Macaulay 1992). Interestingly, second person plurals are 
chosen over all other participants, resulting in the hierarchy 2PL > 1 > 2SG > 3. 
Speakers of other languages in the area, such as those of the Pomoan family, use 
second person plural forms for respect, particularly to elders and in-laws. 
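
The Karuk ranking can be made explicit with a small lookup. The Python sketch below is an illustrative model of the description in this paragraph only; the names `KARUK_RANK` and `karuk_expressed` and the returned field names are hypothetical. Combinations of equally ranked participants (for example, third person acting on third person) are not described here, so the sketch refuses them rather than guessing.

```python
# Rank per the hierarchy 2PL > 1 > 2SG > 3 (lower number = higher priority).
KARUK_RANK = {"2PL": 0, "1SG": 1, "1PL": 1, "2SG": 2, "3SG": 3, "3PL": 3}


def karuk_expressed(subject: str, obj: str) -> dict:
    """Pick the single argument expressed by the verb prefix in a transitive verb."""
    s, o = KARUK_RANK[subject], KARUK_RANK[obj]
    if s == o:
        # Not covered by the description this sketch is based on.
        raise ValueError("equal-ranked combinations are not covered by this sketch")

    # The higher-ranked participant is the one expressed; the other is unmentioned.
    winner, role = (subject, "subject") if s < o else (obj, "object")

    # The inverse suffix -ap is added when a second person prefix represents an object.
    # First person objects are instead kept distinct by the form of the prefix itself.
    inverse_ap = role == "object" and winner.startswith("2")
    return {"expressed": winner, "role": role, "inverse_ap": inverse_ap}
```

For instance, `karuk_expressed("1SG", "2PL")` selects the second person plural object and flags the inverse suffix, whereas `karuk_expressed("1SG", "2SG")` selects the first person subject.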

Northern California thus provides another example of shared abstract struc- 
ture not transferred with substance. Chimariko, Yana, Yurok, and Karuk all show 
person hierarchies in their pronominal affixes on verbs. The forms of their 
pronominal affixes are different. Some are even prefixes while others are suffixes. 
The bases for the pronominal systems are different: Chimariko shows an agent/ 
patient base, while Yana, Yurok, and Karuk show a nominative/accusative base. 
The person hierarchies are slightly different, and they have penetrated the 
pronominal paradigms to differing extents. Importantly, the hierarchies are 
ensured by different mechanisms: differences in the forms of first person agent 
and patient or subject and object markers, obligatory passivization, a directional 
marker ‘hither’, and an old inverse marker. The systems differ in their pathways 
and endpoints of development, but the similarities are striking. Given the over- 
all rarity of hierarchical systems cross-linguistically, and the longstanding, 
intense multilingualism in the area, there is every indication that the similarities 
are due to language contact. As in the Northwest, it is likely that the modern struc- 
tures were not transferred directly as abstract grammatical systems. Rather, what 
may have been transferred were their precursors, certain recurring patterns of 
expression, which subsequently crystallized in each language. (Further details are 
in Mithun in press.) 


3 More General Morphological Structures 


A significant difference between multi-word sentences and polymorphemic 
words is the salience of their parts. Speakers of unwritten languages can typically 
isolate and identify individual words in sentences, but not necessarily mor- 
phemes in words. Recognition of bound morphemes is undoubtedly facilitated 
by such factors as clarity of morpheme boundaries, absence of extensive allomorphy, 
isomorphism between syllable and morpheme boundaries, and position at the 
edge of the word. There are some well-known cases where a particular affix 
has been transferred on the back of lexical items that contain it, such as French 
-age into English. But North America contains certain wide areas where abstract 
morphological structure is shared among genetically unrelated but geographically 
neighboring languages, without shared substance. 


3.1 Lexical suffixes 


All of the Wakashan languages of the Northwest Coast contain large inventories 
of suffixes with meanings typical of roots in other languages: meanings that often 
seem more concrete and specific than those usually associated with affixes cross- 
linguistically. Some of the suffixes have meanings expressed in other languages 
by noun roots, such as -sac ‘bag’, -sii ‘family’, -’aqs ‘woman’, -q?ich ‘year’, and 
-fin ‘costume’. Some are expressed in other languages by verb roots, such as 
-ha ‘buy’, -naga ‘use as bait’, -taqa ‘blame’, -ht ‘exit the woods’, -’atak ‘love’, and 
-‘i's ‘copulate’. Some are expressed by adjectives, such as -ap’ti ‘coiled’, -siik” 
‘complete’, -aafax ‘destined for’, -?at ‘aware of’, and -isim ‘principal’. Some have 
adverbial meanings, many indicating locations or directions, such as -su:fis ‘far 
out at sea’, -a't ‘out of the woods’, -a:ci ‘in a bay’, -sput ‘between the legs’, -it ‘in 
the body’, -yin ‘at the bow of a boat’, and -saqa ‘under the covers’ (Stonham 2005). 
Some examples of their uses are below. 




(14) Ahousaht Nuuchahnulth (Nakayama 2003: 323): 
suswiscara. 
sus-w’isc-’a?a’. 
swim-move.up.bank-on.rock 
‘They swam ashore onto the rocks.’ 


(15) Ahousaht Nuuchahnulth (Nakayama 2003: 378): 
q’aaxaa haaw’itar tuchaa 
q’a:-rxa: ha:w’itat tué-ha’ 
also young.man woman-buy 
‘On the other hand, a young man proposes 

fucacir haw'it?i hak axnak?i. 
fu-ca-Gint haw’it?i hak“at-na'k-?i. 
it-go.to-MOM chief-DEF daughter-having-DEF 
by going to the chief who has a daughter.’ 


It might be wondered whether these are indeed suffixes. Formally, there is no 
question about their status. The languages are uniquely suffixing, and these mor- 
phemes never occur at the beginning of a word. They always follow a stem. They 
can differ subtly from stems functionally as well. The languages generally con- 
tain stems with meanings similar to those of the suffixes. Alongside of the suffix 
-sac ‘bag’ Stonham lists the unrelated niisaak® ‘bag’; the suffix -sii ‘family’ and the 
word /ustagimt ‘family’; the suffix -’aqs ‘woman’ and the word ftuucsma ‘woman’; 
the suffix -‘aap ‘buy’ and the root maakuk ‘buy’; the suffix -taqa ‘blame’ and the 
root wisk ‘blame’; the suffixes -acist ‘on the sea’ and -c’a'tu ‘out to sea’ and words 
tup’at ‘sea’ and Aaaras ‘by the sea’. We expect affixes to have more abstract and 
general meanings than stems or words. Despite their concrete and specific trans- 
lations, many of these suffixes do have broader meanings than their stem coun- 
terparts. The suffix -sac is translated by Stonham variously as ‘vessel, dish, box, 
container’. The suffix -saqa ‘under the covers’ is also ‘under one’s clothing’, ‘in 
a shelter’, and ‘inside’. Still, the lexical suffixes found throughout the Wakashan 
family are typologically unusual for the concreteness of their meanings and the 
sizes of the inventories, numbering in the hundreds. 

What is even more surprising is that large inventories of similar suffixes 
appear in the neighboring families Chimakuan and Salishan as well. A sample 
of Quileute (Chimakuan) suffixes includes -takil- ‘footprint’, -wi:yi’- ‘wall’, -spe:- 
‘fire’, -t’to:- ‘dirt’, -sida ‘water’, -xa- ‘eat’, -k ‘go to a definite place’, t’s- ‘make’, 
-t’co’- ‘have inside’, -kits’ ‘dance, kick’, -qo- ‘make use of’, and -sqa- ‘carry’ 
(Andrade 1933: 262). A sample of Lushootseed (Salishan) suffixes includes -us 
‘face, head, upper part’, -ac ‘hair of head, crest or topknot of bird, hackles of dog’, 
-i¢ ‘cover, surface, on top of, over, string, cord, spine’, -ul¢ ‘container, belly’, -irt 
‘baby, child’, -ik” ‘shirt’, -lc ‘round thing, money, curved objects’, -dup ‘ground, 
floor’, -igad ‘incline, slope, bank, hill’, -ucid ‘body of water’, -ac ‘tree, bush’, and 
-ic’at ‘clothe, wear, support from shoulder’ (Bates, Hess, & Hilbert 1994). In these 
languages as well there is no question about the formal status of the suffixes. They 
never serve as the foundation of a word on their own, but must always follow a 
stem. They also tend to show more general and diffuse meanings than their root 
counterparts. 

The cross-linguistic rarity of affixes like these suggests that the parallelism 
is the result of contact. But the suffixes themselves have not been borrowed. 
How could such a deeply embedded morphological structure be transferred? The 
pattern may not have been transferred in its current state. The functions of the 
suffix constructions are strikingly close to those of compounds, in many cases noun 
incorporation. They are used to create lexical items, such as the Nuuchahnulth 
gawas-sac ‘salmonberry dish’ (‘salmonberry-container’) and tuchaa ‘propose’ 
(‘woman-buy’). They are also used to convey background information not 
worthy of the attention given to separate words. A likely origin of the lexical suffix 
constructions is in compounds. 

Compounding is common cross-linguistically and can be reconstructed for 
Proto-Salishan. A propensity for compounding could easily be spread by bilin- 
guals. As single words, compounds have just one primary stress. Over time, roots 
that occurred as unstressed members of a substantial number of compounds could 
undergo further phonological reduction, resulting in the lexical suffixes of today. 
The large inventories would be explained by the fact that the members of com- 
pounds are drawn from the full inventory of stems. Their relatively concrete mean- 
ings would be explained by the fact that they became bound while they still had 
the concrete meanings of stems. The fact that they do not designate syntactic argu- 
ments or specify particular syntactic roles also follows directly from an origin as 
members of compounds. (More detailed discussion is in Mithun 1997, 1998.) 


3.2 Manner and direction 


Another abstract morphological structure shows a wide areal distribution in 
the West, particularly in modern Oregon, California, and Nevada. It appears in 
languages of numerous distinct families, crossing boundaries between even the 
deepest hypothesized superstocks. 

Central Pomo, a northern California language seen earlier, contains a set of 
prefixes that occur pervasively throughout the verbal lexicon: 


(16) Some Central Pomo verbs 


VEC ‘stick together, be alongside of each other’ 
da-t’é:¢’ ‘push on something that sticks in your hand’ 
PEC ‘stick on with fingers, as chewing gum under a table’ 
ma-t’é:*’ ‘step on a nail or something that sticks in your foot’ 
ca-t’é:c” ‘sit on a thorn, put a patch on pants’ 
h-t’é:c’ ‘stick up a pole, pitchfork, shovel, etc. in the ground’ 
m-t'é:c’ ‘catch fire’ 
p'-t' 6c’ ‘hammer a nail into the wall, nail something on’ 
p'at’é:’ ‘something floating downriver gets stuck on the bank’ 
shee’ ‘while one is drinking, something gets into the mouth that does not belong, such as a bug or dirt’ 
Sa-t'é:c" ‘stick a support, as a box, next to something long, like fence posts stored upright for use’ 


Such morphemes are sometimes called ‘instrumental prefixes’, because they sug- 
gest a means or manner of motion, but there is no explicit specification of the 
role of an entity beyond general involvement. They are not specifically nominal 
or verbal: often translations like either ‘with the foot’ or ‘by stepping’ would be 
appropriate. They can also co-occur with nouns specifying an instrument: 


(17) Central Pomo prefixed verbs with nouns 
Q"abéwi — e"ba:c’. 
g'abé=wi &"-ba:-c’ 
rock=with massive.object-split-SEMELFACTIVE.PERFECTIVE 
‘He cracked it open with a rock.’ 


These prefixes can be reconstructed for Proto-Pomoan. Prefixes with similar 
functions occur in a number of other families and isolates in the area: Yuman, 
Karuk, Yana, Palaihnihan, and Washo. All of these were included at one time or 
another in proposals for a larger Hokan stock by Sapir and others. But the 
prefixes are absent from other languages and isolates grouped as Hokan: Shasta, 
Esselen, and Salinan. They also occur in California languages not grouped as Hokan. 
They appear throughout the Chumashan family, in Wappo, and Yuki. They 
occur in some isolates and families hypothesized by Sapir to be part of a larger 
Penutian stock in California, Oregon, and Idaho: Maiduan, Klamath, Takelma, 
and Sahaptian (Sahaptin and Nez Perce). But they are absent from other families 
and isolates grouped as Penutian: Wintuan, Utian (Miwok-Costanoan), and 
Yokutsan. They even occur in a number of Uto-Aztecan languages of the area, in 
the Numic branch: Kawaiisu, Tümpisa (Panamint) Shoshone, etc. The languages 
vary substantially in their inventories and the productivity of prefixes, and none 
of the forms themselves are cognate across genetic lines. 

Central Pomo also contains a set of directional suffixes: 


(18) Central Pomo suffixes 


Ca-w ‘run’ (one) 
Ca-:la-w ‘run down’ 
Ca-:gac’ ‘run up (as up a hill)’ 
Ca-c" ‘run away’ 
Ca-way ‘run against hither, as when a whirlwind came up to you’ 
Ca-:’w-an ‘run around here and there’ 
ca-mli-w ‘run around it (a tree, rock, house, pole etc.)’ 
ca-mac’ ‘run northward’ 
Ca-:q’ ‘run by, over (along on the level), southward’ 
cai-m ‘run over, on, across (as bridge)’ 

The suffixes can appear in verbs containing the means/manner prefixes described 
above. 


(19) Central Pomo prefix-suffix combinations 


da-di-:la-w ‘push something over a cliff’ 
ma-di-:la-w ‘kick something over a cliff’ 
p"a-di-:la-w ‘slowly glide into a swimming pool’ 
p"-di-:la-w ‘jump down, over a cliff, into the water’ 
Ca-di-:la-w ‘chase (dog) downhill’ 
ba-di-:la-w ‘walk downhill singing’ 
-di-:la-w ‘carry something downhill in hands’ 
§-di-:la-w ‘carry something downhill by the handle’ 


Like the prefixes, these directional suffixes are pervasive throughout the verbal 
lexicon and can be reconstructed for Proto-Pomoan. 

Suffixes with similar meanings occur in other languages once proposed as part 
of Hokan: Karuk, Shasta, Palaihnihan (Atsugewi, Achumawi), and Yana. There 
is, however, no mention of them in other languages grouped as Hokan, even some 
that contain means/manner prefixes, such as languages of the Yuman family and 
Washo. They appear in some isolates and families proposed as part of Penutian: 
Maidun, Klamath, and Sahaptian (Sahaptin, Nez Perce). But other languages and 
families grouped as Penutian lack them, including some that contain the prefixes: 
Wintun, Utian, Yokuts, Takelma, Coos, Siuslaw, and Alsea. Again, the forms are 
not shared across genetic lines. 

There is thus a widespread morphological structure, appearing with varying 
degrees of robustness over a large geographical area that extends over California, 
Oregon, and areas inland. The prefixes occur in over a dozen genetically distinct 
units, and the suffixes in seven. The functions of the prefixes and suffixes are 
strikingly similar, but the forms differ. An obvious explanation would be contact, 
but how could such abstract structure, below the level of consciousness of words, 
be transferred, particularly without the forms themselves? It seems unlikely that 
bilingual speakers would spontaneously create affixes in one of their languages 
by analogy with affixes in the other. 

Here again, the structures need not have been transferred in their modern state. 
It is more likely that the precursors to these structures were transferred. The most 
obvious precursors are particular compounding patterns. 

Prefixes at an early stage of the development can be seen in languages near the 
periphery of the area. All of the Uto-Aztecan languages show extensive compounding: 
Noun-Noun, Verb-Verb, and Noun-Verb compounds. Some languages 
of the Numic branch of the family show prefixed verbs as well: 


(20) Kawaiisu (Zigmond, Booth, & Munro 1991): 
Noun root mofo- ‘hand’ mofo-zigt ‘hand-wash’ ‘wash one’s hands’ 
mofro-para ‘hand-stir’ ‘stir by hand’ 
Prefix ma- ‘manually’  ma-gavi-  ‘manually-cut’ ‘break off’ 
ma-guri-  ‘manually-circle’ ‘stir by hand’ 

A typical feature of nouns in compounds is the lack of a specific syntactic rela- 
tionship to the verb. As can be seen above, the noun root in Kawaiisu noun-verb 
compounds can represent entities in a variety of semantic roles, including that of 
an instrument. 

Kawaiisu also contains directional suffixes: 


(21) Kawaiisu directional suffixes (Zigmond et al. 1991): 
£ga- ‘enter’ £ga-kwee- ‘go in’ £ga-ki- ‘come in’ 
yaa- ‘carry one’ yaa-kwee- ‘take one’ yaa-ki- ‘bring one’ 
hurma- ‘carry several’ hurma-kwee- ‘take several’ hu?ma-ki- ‘bring several’ 


Kawaiisu still has a verb root -kwee ‘go’ and a verb root -ki ‘come’. The directional 
suffix constructions appear to be descended from verb-verb compounds. 

The morphological structure so prevalent in the area today appears to have 
developed from compounding patterns. Bilinguals could easily spread a tendency 
to form noun-verb or verb-verb compounds with an initial member indicating a 
means or manner of motion, and a tendency to form verb-verb compounds with a 
second member specifying direction. Over time, frequently occurring unstressed 
elements of such compounds could be reduced to affixes like those seen through- 
out the area (Mithun 2007a). 


3.3 Morphological form: clitic structures 


The North Wakashan language Kwak’wala shows a somewhat unusual morphological 
structure. Case is marked on demonstratives which precede the noun 
phrase. What is surprising is that the clitics are attached phonologically not to 
the following noun phrase in their scope, but to the preceding word, whatever 
its function. 


(22) Kwak’wala nominative and accusative case (Boas 1911a: 557): 
do:x’warél=e Dzd:wadalalisa=x-a élkwa. 
dog"-’adela=e Dzawadalalisa=x-a elk” 
see-suddenly=NOM.PROPER NAME=ACC.COMMON-DEM blood 
‘Dzawadalalis [NOMINATIVE] saw the blood [ACCUSATIVE].’ 


(23) Kwak’wala nominative and oblique case (Boas 1911a: 533): 
‘né:x/so:’la=e Q’amtalata=s Q’ane:ge:'lak” 
ne:x’-so:-’la=e Q’amtalata=s Q’ane:ge:'lak” 
tell-PASSIVE=HEARSAY=NOM.PROPER NAME=OBLIQUE NAME 
‘It is said, Q’amtalat [NOMINATIVE] was told by Q’aneqe’lak” [OBLIQUE].’ 


The full clitic structure cannot be reconstructed for Proto-Wakashan. In the two 
other North Wakashan languages, Heiltsuk and Haisla, it occurs only before 
obliques. 

(24) Heiltsuk (Rath 1981: 1.85): 
Déduqula wisma-xi w'dc’id-x hi=s dugvayua-xi. 
watch man-DEM dog-DEM DEM=OBL binocular-DEM 
‘The man watched a dog with [OBLIQUE] binoculars.’ 


Case clitics do not occur at all in the South Wakashan languages. 

Even more remarkably, this unusual clitic structure appears in the 
unrelated Tsimshianic languages, immediately to the north of the North Wakashan 
languages. The systems are not identical: while the Kwak’wala enclitics show 
a nominative/accusative pattern, the Tsimshianic enclitics show an ergative/ 
absolutive pattern. Tsimshianic clitics are discussed in detail in Stebbins (2003). 


(25) Sm’algyax (Coast Tsimshian) (Boas 1911b cited in Mulder 1994: 204): 
Da gwaant=ga ‘wii  gyisiyaask. 
then blow=COMMON.ABSENT.ABS great northwind 
‘Then the great northwind blew.’ 


(26) Sm’algyax (Boas 1911b cited in Mulder 1994: 81): 
Dm dzakda=sga gibaw=ga haas-ga 
FUT kill=COMMON.ABSENT.ERG wolf=COMMON.ABSENT.ABS dog-DEM 
‘The wolf will kill the dog.’ 


There is no doubt about the phonological bond between the clitics (called “con- 
nectives” in the Tsimshianic literature) and the preceding word. Dunn observes: 


In hesitating and pausing, speakers always tie the connective to the preceding word, 
that is, they always pause after a connective. They never continue a sentence (after 
a pause) by starting with a connective; they may repeat the last word before the pause 
but never just the connective. (Dunn 1979: 131-2) 


The cross-linguistic rarity of this structure suggests that the similarity is unlikely 
to be due to chance. The transfer of such a pattern of bound morphology seems 
at first unlikely. But again, the structure need not have been transferred in its 
modern form. 

A number of languages in North America show a recurring rhetorical struc- 
ture by which speakers manage the flow of information. Typically speakers intro- 
duce no more than one major new item of information at a time in a prosodic 
phrase (Chafe 1987; 1994; Pawley 2000). This information might set the scene, 
introduce a new participant, present a new event, etc. Particularly in predicate- 
initial languages like those of the Wakashan and Tsimshianic families, a prosodic 
phrase may consist of an initial verb that provides an outline of an event, 
followed by a demonstrative that functions cataphorically to promise further 
information to come, such as more precise identification of a participant. Such a 
structure can be seen in the South Wakashan language Nuuchahnulth. 


(27) Ahousaht Nuuchahnulth (Nakayama 2003: 586): 
histaqsixitquu tah, 
his-taq-8ix-it-qu: tah, 
there-come.from-MOM-PAST-COND.3 this 
hitagiit. 
hita-aqiit. 
there.MOM-in.a.sound 
‘He could have started from here — the Puget Sound area.’ 


A recurring rhetorical pattern of this type, where a demonstrative is grouped prosod- 
ically with the preceding material, could set the stage for morphological fusion. 
It would be easy to transfer a rhetorical pattern like that in (27) through contact: 
the initial predicates and demonstratives already existed in both languages. The 
social circumstances for transfer were in place as well. There was intense contact 
among Wakashan and Tsimshianic-speaking peoples, including intermarriage and 
extensive multilingualism, that extended into recent times (Codere 1990: 360; Halpin 
& Seguin 1990: 275-6; Hamori-Torok 1990: 306; Hilton 1990: 314-17). Ceremonies 
involving elaborate oratory were also shared. 


4 Conclusion 


The Americas provide rich examples of effects of language contact, far beyond 
those mentioned here. In most cases, the details of these effects are not documented 
by a philological record comparable to those for certain languages of Europe, but 
comparisons of modern languages can be revealing. Many of the languages offer 
a look at the potential role of contact in stimulating the development of struc- 
tural parallelisms, even in the absence of borrowed words and morphemes. 
Considering the structures in diachronic perspective can often open the way to 
understanding the mechanisms by which such parallelisms can arise. Abstract pat- 
terns need not be transferred as such. Their development can have been set in 
motion by contact at an earlier stage, with the transfer of particular patterns of 
expression and frequencies of stylistic choices. 
34 Language Contact in 
Africa: A Selected Review 


G. TUCKER CHILDS 


1 Introduction 


This chapter introduces readers to the immense variety of language contact situ- 
ations on the African continent. Simultaneously it underscores the need for more 
extensive and deeper research, investigating the sociohistorical conditions in 
addition to the linguistics, especially in the case of unwritten and undocumented 
languages. Any survey of the literature on language contact in Africa reveals the 
dearth of such information despite the abundance of linguistic analysis. At times 
linguists seem preoccupied with the unusual or exotic results of language con- 
tact much to the detriment of understanding how such results were achieved. 

Probably because of the pioneering classification of African languages by 
Joseph Greenberg (e.g. Greenberg 1963), the tradition in genetically classifying 
African languages has generally been one of lumping rather than splitting. 
Scholars have accepted his analysis of African languages into four different phyla 
and accepted most of the subdivisions, although there have been many recent criti- 
cisms (e.g. Dimmendaal 2001). 

The contact scene is wildly under-researched. In addition to the generally 
expected reasons, one reason is that “divergence processes [are seen as] the 
paradigm scenario of language history” (Güldemann 2008: 184). Others include 
at least: Africa’s daunting field conditions, perhaps racism, multiple traditions 
in Europe proceeding from colonial rivalries, the lack of interest and resources 
on the part of Africans themselves, and the interest in theory and Amerindian 
languages by the North Americans (Childs 2003c). 

One tantalizing question that has been raised a number of times asks whether 
Africa as a whole constitutes a linguistic area. This would of course imply a great 
deal of contact over a widespread area likely for an extended period of time. That 
such a question has been raised so many times points again to the lumping 
tradition in African linguistic studies. It also suggests that contact might be just 
as important if not more important than genetic inheritance for understanding 
the structures and relatedness of African languages. A recent paper answers this 
question with a qualified “yes” using quantitative measures (Heine & Leyew 2008). 
A paper in the same volume divides Africa into five areas on the basis of 
phonological features (Clements & Rialland 2008) and another recognizes a core area in 
the central Sudan (Güldemann 2008). A final paper examines the many claims for 
Ethiopia as a linguistic area, and concludes that indeed these early scholars were 
correct (Crass & Meyer 2008). Thus Africa has at least one agreed-upon linguistic 
area and several other proposals for linguistic areas. 

A claim for Africa itself as a linguistic area, however qualified, is difficult to 
assess. The relatively recent expansion of Niger-Congo, and especially Bantu, has 
led to the phylum’s overweening influence on languages from other phyla with 
which it has been in contact. The spread of Bantu is undoubtedly one of the most 
momentous migrations in the (known) history of Africa. Its linguistic repercussions 
have been great, causing the disappearance of many languages and the creation 
of others. Another major force at work has been the rise of Sudanic empires and 
the spread of Islam. The types of contact that yield the most diverse varieties often 
involve significant disparities in power and resources, such as those listed below: 
(1) a. External trade, e.g., trans-Saharan, Indian Ocean 
b. Technological innovations, e.g., the spread of agriculture, iron-working 
c. African empire building 
d. European colonization 
e. Spread of external religions, especially Islam 


Little sociolinguistic or historical information such as this has been adduced in 
discussions of contact in Africa. This neglect illustrates how little attention lin- 
guists have paid to the nonlinguistic, however important it may be to understanding 
contact phenomena. Scholars with a more ethnographic orientation often criticize 
linguists for being too narrowly focused on purely linguistic phenomena, and such 
a charge seems legitimate here. 

Another general weakness in contact studies on the African continent is the lim- 
ited scope of the inquiry, usually being confined to just two languages. However 
high the quality of these characterizations, the generalizations reached often have 
little applicability beyond the immediate context. There are recent exceptions to 
this criticism, and in what I say below I expand the scope to sets of languages or 
language groups. 

This discussion begins by continuing to review the literature on language 
contact in Africa, an undertaking much facilitated by the recent appearance 
of two books on closely related topics. The major part of the chapter, however, 
is devoted to a set of case studies selected to illustrate the types and range of 
contact phenomena found on the continent. 


2 Literature Review 


No overarching review of contact phenomena in Africa exists; in fact, such a review 
may not even be possible. However, a great number of publications are relevant 
from related areas: 

(2) a. language classification 
b. historical and comparative linguistics 
c. the sociology of language, multilingualism, language maintenance and shift 
d. applied linguistics, language planning and language policy 
e. language endangerment and language death 


Two recent books discuss language contact as one of their major themes. Within 
both are extensive discussions of the literature on language contact in Africa, and 
the reader is referred to those books for further information than that presented 
here. In this section I will discuss the two (outside the literature reviews they pre- 
sent) and then a few other references also relevant to issues in language contact. 
These will be works that either appeared after these two books or ones that need 
further mention. 

The first is Roger Blench’s Archaeology, Language, and the African Past (Blench 
2006), a major work in several related fields. As can be inferred from the title, 
Blench’s main purpose is to show how a combination of primarily archeological 
and linguistic approaches can help reconstruct the history of the continent. Other 
fields from which he draws to achieve his synthesis are comparative ethnology 
and DNA testing. The way archeology can be used is obvious, but the linguistic 
analysis may need a word of explanation. Knowledge of language contact pat- 
terns is crucially part of the discovery process. The key element in the linguistic 
analysis is differential borrowings into genetically related languages (or a single 
language) as evidence of different periods of language contact. The picture 
emerges as a layering of linguistic influences, known as “stratigraphy.” The 
enormity of the task Blench has undertaken is daunting and his treatment is at 
the same time comprehensive and provocative. 

For example, Blench identifies a number of different linguistic areas in Africa, 
beyond those mentioned above: Central Nigeria, the Nuba Hills, Central Tanzania, 
the whole of Chad, much of Cameroon, and the Caprivi Strip in Namibia. He him- 
self recognizes the preliminary nature of the work in language contact (Blench 
2007: 69), an observation made in almost every work discussed here. 

Blench surveys many other contact phenomena. One that has engaged scholars 
for many years is the status of the Pygmy languages, which now seem to have 
no unique or unified status. One of Blench’s more controversial topics is his claims 
about the spread of noun class systems, seeing them as being reintroduced to the 
African continent, an argument too involved to be treated here. 

The second major book treating contact phenomena on the African continent 
is A Linguistic Geography of Africa (Heine & Nurse 2008). This work contains even 
more material of relevance to language contact than Blench’s. Bernd Heine and 
Derek Nurse, themselves two major figures in the field of African linguistics and 
language contact, have assembled an impressive list of contributors, most of whom 
are mentioned in this chapter. A longish article from the volume not mentioned 
elsewhere compares the typological features of Africa with those found elsewhere 
(Creissels et al. 2008). These authors contrast Africa with other parts of the world 
to suggest areal groupings, concluding that African languages possess an areal 
nature, as do Heine and Leyew (2008). 

The other works discussed in this section come from Africanists with interests 
in all aspects of language description and linguistic theory. I have been able to 
treat only a sampling from the authors, all of whom have published much more 
extensively than can be discussed here. There is a slight geographical order to 
the discussion. 

One scholar making important contributions in East Africa and elsewhere is 
Martin Mous. His important work on Ma’a/Mbugu (Mous 1994; 2003; forthcoming) 
stands as a model for the detailed and sociohistorically informed treatment of a 
language and its nearby relatives. Another scholar dealing with East African lan- 
guages and with wide typological concerns is Gerrit Dimmendaal, who contributes 
an important chapter to the Heine and Nurse volume discussed in the preceding 
section, as well as an important overview of language contact and genetic classi- 
fication (Dimmendaal 2001). A third is Derek Nurse, much of whose work has 
been concerned with the Bantu languages of East Africa, especially with Swahili 
and related varieties. Swahili was once thought to be the product of creolization, 
but this was never proven, and the question is no longer controversial, thanks largely 
to the careful scholarship of Nurse and his co-workers. Matthias Brenzinger has 
concentrated on language death, especially as it occurs in East Africa. Two articles 
in an edited volume (Brenzinger 2007) focus on language endangerment both 
throughout the world and in several parts of Africa. Another important Bantu 
scholar writing on contact phenomena is Salikoko Mufwene, concentrating in par- 
ticular on Bantu in Central Africa (Mufwene 2001; 2008). Samarin (2008) also focuses 
on contact varieties in Central Africa. 

Concentrating his areal focus on the northern and central Sahel, especially on 
Songhai, Robert Nicolai has long worked on language contact and has recently 
launched an online journal, the Journal of Language Contact. 

Further south, there has been a wealth of publications on the transfer of clicks 
from Khoisan to Bantu, e.g., Herbert (1990a; 1990b), and much on the speech 
varieties of South Africa, especially on Afrikaans because of its many political 
associations. 

The work of Tom Güldemann has significantly contributed to our 
understanding of contact phenomena all over Africa. He has looked at the Kalahari 
Basin in southern Africa, at northeastern Africa, and at the Sudan. He finds that 
the upper levels of genetic classification are questionable, and suggests that 
Greenberg’s higher genetic groupings may be geographical. “His ‘linguistic 
area’ (the term is used more loosely than elsewhere) in the Central Sudan is an 
‘innovation area’ that radiates features... primarily the result of geographical 
factors which have been relevant for a sufficiently long time period” (Güldemann 
2008: 180). 

The next section presents a summary of the interaction between two groups of 
Niger-Congo, the largest phylum in Africa and the world. Both groups are con- 
centrated in the under-represented western part of West Africa and have been in 
contact for centuries. Their interaction ranges socially from partial to complete 
assimilation and linguistically from a few borrowings to language shift and lan- 
guage death. The section after examines South African contact varieties, and the 
next section looks at some of the newest contact varieties on the continent, 
“urban youth languages” (Kiessling & Mous 2004). 


3 Atlantic and Mande 


Atlantic and Mande constitute at least two and possibly three or more early branches 
off the Niger-Congo stock.* The Atlantic Group is found in a broad swath along 
the Atlantic coast from Senegal to Liberia (the nomadic Fula extend further east). 
The unity of Atlantic, consisting of fewer than 50 languages, has long been 
questioned, especially the relatedness of the northern and southern branches. Mande 
is a much more unified group, located further inland in an interior swath 
throughout the western Sahel. 

Historically and socioculturally the peoples are quite different, although 
considerable convergence has taken place over time through what has been called the “Mande 
Expansion” but which could just as easily be called the “Atlantic Retreat.” What 
is meant by these terms can be seen in Figures 34.1 and 34.2. In the first and 
more recent map is shown the entirety of the Atlantic territory today (excluding 
Fula). A brief examination shows that the majority of the area is covered by just 
two languages, Wolof and Fula (numbers 1 and 2, including “2bis”). It is these 
two languages that have performed the same function as Mande and have become 
“predatory” languages (Blench 2006) conducting glottophagie (Calvet 1974) on 
Atlantic. 

Figure 34.2, on the other hand, shows the Atlantic languages, minus Fula and 
Wolof, where the Atlantic languages are represented in striped areas. The gen- 
eral picture that emerges is that of smaller groups being pushed toward the sea 
and in a few places fleeing to the mountains. Several misrepresentations on this 
map show how much further peripheralized and reduced the Atlantic languages are. 
I will mention only a few examples (the reader may want to refer back to Figure 
34.1 for a complete listing of languages): 


(3) a. “Diola” represents a cluster of 10 languages, several of which are 
threatened 
b. Cangin is similarly a set of 5 languages 

c. The inset area represents many other threatened languages (5-10). 

d. The area known as Sherbro is not in fact “Sherbro.” Most Sherbro 
speakers are today found on Bonthe Island in reduced numbers. The 
coastal Sherbro area is all Mende with several villages containing a few 
speakers of virtually moribund Atlantic languages: Kim (20 speakers) and 
Bom (300). 


Those groups that have preserved their history, e.g., the Mani (“Mmani” in 
Figure 34.2), recount stories of being pushed to the sea (Childs, in preparation). 
Known history reveals similar events. The Kisi fled earlier (sixteenth century) from 
the Mane invaders (Rodney 1967) and are now surrounded by Mande groups. 

Figure 34.1 The Atlantic languages with Fula and Wolof (Segerer 2004) 


They have survived as a fairly viable group likely because of their isolation in 
the forests and in relatively inhospitable regions (Childs 1995a). Although no one 
is certain, several groups in the Southern Branch claim a northern origin; just as 
uncertain as the Atlantic homeland is the homeland of the Mande themselves 
(Blench 2006: 84). 

Figure 34.2 The Atlantic languages (Wilson 1989) 

The Mande expansion, however, took place in two stages, the first peaceful 
and gradual, the second more militant and concentrated. Mande penetration 
originally consisted of peaceful trade and settlement beginning in the early 
centuries of the Common Era, a steady influx of smiths and traders. The smiths 
importantly obtained power through their influence on the secret societies and 
the traders through their control of commerce and resources. The second phase 
of the Mande expansion was an 


era of conquest and state building by Mandekan [Mandeng] warriors that began 
during the [fifteenth century] ...the Mandekan (horse) warriors, who achieved 
their control in western Africa strictly through physical force and collaboration with 
the Mande traders and smiths, were already in place. The second phase brought 
far-reaching changes to western African peoples (Brooks 1993: 59). 


The conquest and subsequent social stratification had linguistic implications, 
according to Brooks. “With few exceptions the warriors spoke Mandekan languages 
that subsequently diffused among the conquered groups” (Brooks 1993: 97; cf. 
Murdock 1959: 267ff.). Earlier accounts stated the linguistic consequences more 
specifically. One branch of these warriors was the Manes, discussed in the quote 
below. Mende, Bandi, Looma, and Loko are all Mande languages; Temne is 
Atlantic. Here is the situation as characterized in Childs (2002): 


On linguistic grounds, Northcote Thomas linked the Mende with the Gbande 
[Bandi] and the Toma [Looma]; he suggested that “in the Mende we have the 
portion of the Manes who drove out the aborigines or completely dominated them; 
in the Loko, a tribe originally of the aboriginal stock but brought so completely under 
the influence of the Manes as to adopt their language instead of their own; and that 
the Temnes are also aborigines who were forced to take alien chiefs, but maintained 
in large measure their own culture, and in places won back from the invaders a 
portion of the territory the latter had subjugated.” (Thomas 1919, 1920; as in Rodney 
1967: 230) 


Linguistically, the groups have been said to be typological opposites (e.g. Wilson 
1989), but this statement needs revision in light of current findings (Childs 2002; 
2004a), many of which were reported at a recent workshop (2008). The statement 
in the preceding paragraph suggests a possible motivation. There have been intense 
periods of contact, and because of the asymmetrical control of power and 
resources, the influence has gone one way. The major exception is the case of Fula 
to Jalonke on the Fouta Jallon in Guinea, in which case the influence went the 
other way, likely the consequence of the Jalonke being enslaved by the Fula 
(Friederike Lüpke 2003, p.c.). Thus, in these interactions one can see played out 
many of the types of contact found elsewhere on the continent. What makes this 
case particularly informative is that it takes place independently of European con- 
tact and thus could be representative of other indigenous phenomena. Another 
reason for examining Atlantic-Mande interactions is the gap in the literature. 
The treatment also involves language groups rather than individual languages. 

Some of the general consequences of these sociohistorical factors can be seen in 
a number of structural effects, as well as in the march toward language death by 
many of Atlantic’s members. The list in (4) below represents a sampling of the results: 


(4) The retention of tone in Atlantic            Childs (1995b) 
    The Kisi lexicon                             Childs (2002) 
    S-Aux-O-V-X word order in Kisi               Childs (2003b) / Güldemann (2008) 
    Soso influence on Mani: calques, borrowings  Childs (2004a) 


The most dramatic effect, however, is in the overwhelming trend of speakers 
of Atlantic languages to shift to Mande or to the more widely spoken Atlantic 
languages, whose speakers have generally assimilated to Mande ways. The 
majority of the Atlantic languages are less widely spoken and are threatened, the 
general case in Africa (Brenzinger, Heine, & Sommer 1991). (Only a few Mande 
languages are threatened; Valentin Vydrine 2004, p.c.) Little literature exists 
on the dire situation in Atlantic. Blench (2007) treats West Africa, but only the 
eastern portion. Childs (2008) represents something of a call to arms in terms of 
language documentation, and several oral presentations have also raised the alarm 
(e.g. Childs 2007). 

In historical times, several Atlantic languages have indeed died, with others 
being nearly extinct, e.g. Bom and Kim (Krim in the literature). Recent research 
on the two finds only 20 speakers of Kim and several hundred speakers of Bom. 
Another language recently documented is Mani (Mmani), which has fewer than 
300 speakers. As another example, the last speakers of Mo-peng had been sur- 
rounded and overcome by speakers of Bedik (Atlantic) and Mandinka (Mande) 
some years ago (Ferry 1975: 81-2). In 1993 Ferry could find no speakers of Baga 
Kalum (her “Baga Koba”), which used to be spoken in the area of Conakry. 
Language mixing and intertwining have likely occurred with Atlantic, although 
such claims, numerous as they have been in the older, nonlinguistic literature 
adduced above, need to be tested. In North Atlantic at least the following are under 
immediate threat: Konyagi, Biafada, Kasanga, and Banyun. 

A language situation not unrelated to these is the case of (Ma)Banta/Banda, 
first reported as a dying language (Dalby 1963). Recent work by Martin Kailie 
confirms that “Banda” is only a dialect of Temne, albeit much influenced by Mende, 
used primarily for ritual purposes. Temne is the language from which the ances- 
tors of the present-day speakers originally switched. All speakers are reported to 
still speak a Temne-accented Mende, however, perhaps also a mark of identity. 
Presumably moribund varieties often persist as religious or ritual languages in a 
diglossic situation, providing them their only chance for survival. For example, 
some languages persist as they are the only ones that can reach the ancestors, 
the original owners of the land who must be propitiated from time to time. Such 
varieties may also persist as court languages. Interestingly, the Cangin languages 
(North Atlantic) survived as home languages in another diglossic situation, and 
were unknown until reported in Pichl (1966). 

In summary, however, the Atlantic languages are severely threatened, with 
roughly 20 percent in an extremely dire state, likely to disappear in a generation, 
despite the possibility of a diglossic situation. The next two sections treat rela- 
tively new varieties arising in contact situations, most of them much more robust 
than the smaller Atlantic languages discussed in this section. 


4 Pidgins and Creoles 


Creolists have very often been in the vanguard of contact studies, even in 
Africa, just because the interactions and results are typically so telescoped. 
Unfortunately, however, their focus has been on the resultant variety rather than 
on the African substrate, and their knowledge of African languages is limited or 
even naive. Doneux (1999) is one of the few that has looked at a creole from a 
well-informed perspective in assessing the importance of Guinea-Bissau’s languages 
on the resident pidgin. The conclusion, however, to a detailed consideration of 
Africa as a linguistic area is that “creole languages do not exhibit any noticeable 
typological affinity with African languages on the basis of our survey data” (Heine 
& Leyew 2008: 35). 

As might be inferred on the basis of its history as the target of European expan- 
sionism and exploitation, the African continent has given rise to a number of 
pidgins, many of them persisting today as widely used pidgins or even creoles. 
In many countries the range of languages is more than what was once referred 
to as a “(post-)creole continuum” (DeCamp 1971). In many cases it is a speech 
economy featuring the entire set: substrate languages, a pidgin, an extended 
pidgin, a creole, the superstrate, and many varieties intermediate between these 
types. Thus, only for a relatively limited minority in each country’s metropolis 
is the restructured variety spoken as a first and only language. It is to these 
situations that this section turns its attention, identifying roughly comparable 
situations in Liberia and Sierra Leone and a slightly different one in Guinea. 

Africa is one of the few places on earth where there are still varieties that 
qualify as pidgins in the original use of the term as given in the following: 


(5) a. A contact vernacular, normally not the native language of any of its 
speakers, used in trading or a similar situation where its speakers have 
no common languages — functionally limited. 

b. A marginal language, the contact generally too specialized and the 
cultures and languages too widely separated for a lingua franca to arise, 
social distance maintained between cultures. Substrate languages not 
closely related. 

c. Linguistically characterized by a limited vocabulary, the elimination of 
inflections, a drastic reduction of redundant features, “simplification”; 
considerable phonological variation, an admixture of local vocabulary to 
meet the special needs of contact groups. 

d. Stable, has norms of meaning, pronunciation, and grammar (see Holm 
1989, for instance). 

The lack of a common language leads to varying strategies, all of which, however, 
are united in their communicative goal; the result is a useful and successful vari- 
ety that has been called any number of things in the past but is now known as a 
“Medium for Interethnic Communication” (MIC; Baker 2000). What is important 
is to see these varieties as the result of creative strategies to communicate (Childs 
2004b). Still extant pidgins can be found in West Africa, e.g., Krio in Sierra Leone, 
Liberian English in Liberia, and even Tsotsitaal in South Africa. Many, however, 
become the primary language of their speakers, having in effect taken the next 
step and been creolized. 

In both Liberia and Sierra Leone the full range of varieties exists, primarily because 
of the presence of repatriated Africans and the longevity of their presence. In addi- 
tion to the varieties listed here, there are also local standard varieties of English, 
not mentioned here. The Liberian variety is normed on the American dialect of 
English, and the Sierra Leone variety on British English. 


(6) The restructured varieties of English in Liberia (see Singler 1997): 

a. Settler English: A creole spoken by the descendants of ante-bellum 
settlers from the Mid-Atlantic states of the US (3 percent of the total 
population of 2.18 million in 1984) who lived in and around Monrovia 
(306,000 inhabitants). 

b. Kru Pidgin English: Closer to West African Pidgin English (WAPE) 
because of distinct historical origins, “Kroomen” on ships plying West 
African waters. 

c. Liberian (Pidgin) English: Second language varieties used as a lingua 
franca throughout the rest of the country; heavily influenced by Mande 
languages; developed from those who joined the army and who worked 
on plantations. 


A roughly comparable situation exists in Sierra Leone. 


(7) The restructured varieties of English in Sierra Leone: 

a. Krio: The first and only language of the Krio population of Freetown, 
repatriated Africans, well educated and historically in control of the 
professions. 

b. Pidgin Krio: The form of Krio used up-country by speakers of indigen- 
ous Sierra Leone languages — in its most developed form close to the Krio 
of Freetown. 


The second set of circumstances is found much further south, yielding contact 
varieties just as robust. The same inequities are extant, yet this time arising from 
Europeans who stayed. Only with the freeing of Nelson Mandela and the enfran- 
chisement of the majority population did South Africans receive independence 
comparable to that achieved by other African nations beginning in the 1960s. The 
result is a continuum of varieties with even greater articulation than that found 
in Guinea, Liberia, and Sierra Leone. 

The former South African government was one of the most oppressive social 
regimes ever known, imposing apartheid on its people and keeping the majority 
population out of power. These social conditions have led to a whole gamut of 
linguistic varieties. Dutch was, of course, the European language introduced with 
the arrival of the first Dutch settlers beginning in the seventeenth century. Most 
scholars agree that Afrikaans is an extreme form of Dutch heavily influenced 
by the contact with the original inhabitants. Kaaps is also a restructuring of 
Dutch but is now a language of identity for the people formerly known as the 
“Coloureds,” and not mutually intelligible with Afrikaans. Tsotsitaal is a mixed 
variety based on Afrikaans originating in the mines and flourishing in the 
Johannesburg townships until the imposition of apartheid and the “Removals” 
in 1954. Isicamtho replaced Tsotsitaal and is more of a Zulu slang. The latter two 
varieties are discussed in greater detail in the following section. 


(8) The South African continuum 
Dutch Afrikaans Kaaps Tsotsitaal Isicamtho 


On the linguistic side, as one moves from left to right, the varieties become less 
European and more African. On the social side, the varieties generally become 
less associated with overt prestige and more associated with covert prestige on 
the part of their speakers. 

What makes the contact situation in South Africa even more complicated is the 
presence of yet another pidgin which inverts the regular formula for pidgin for- 
mation, Fanagalo. What is unique about this variety, once widely used in the mines 
and still used on construction sites and in Indian shops, is that it was imposed 
from above by the mine owners and is lexified by an African language, a much 
simplified Zulu. Its structure comes entirely from European languages. 

Shading into and partially overlapping with these restructured varieties is a 
separate category of less restructured varieties used in Africa’s cities. 


5 Urban Varieties 


At least one important development in modern Africa is the rapid urbanization 
that forces contact between people from a great number of different backgrounds. 
The inequities they face and the lack of opportunity and access to power lead to 
a distancing but also to a solidarity among the dispossessed. Here the result is 
not an MIC; the goal is not the same as that of the pidgins discussed in the previous 
section. The result is an urban variety that reflects those experiences. “Urban 
varieties” have the characteristics given in the following: 


(9) Typical features of African urban vernaculars (from Childs 1998): 
Linguistic (a): Mixed or hybrid, often pidginized or originating as a pidgin; 
highly variable 
Linguistic (b): Grammar from the substrate rather than from the superstrate, 
but many superstrate borrowings 
Historical: Receives input from, if not originates in, a criminal argot 

Social (a), Age: Spoken by the young, typically age-graded; may evolve into 
language change and even language shift 

Social (b), Gender: male oriented 

Acquisition: Learned from one’s peers 

Sociosymbolic value: Sophisticated city life 


Two urban varieties in South Africa are Isicamtho and its predecessor Tsotsitaal, 
from which Isicamtho must be differentiated. “Tsotsitaal” (lit. ‘thug talk’, also known 
as “Flaaitaal” or “Fly Taal”) originated in the mines of the Witwatersrand and is 
a dramatically restructured variety of Afrikaans once spoken in South African cities 
and serving the same functions as Isicamtho. 


The roots of Fly Taal go back to the 1886 gold rush in the Transvaal . . . [it] appar- 
ently emerged in the 1930s as a primary language of many of its users .. . Fly Taal 
apparently coalesced into its present form with the institutionalization of apartheid, 
when urban blacks were forced to move into segregated African townships such 
as Johannesburg’s South Western Township or Soweto, established in 1954. (Holm 
1989: 350-1) 


From a structural perspective Tsotsitaal has obvious creole features, ones absent 
in Isicamtho. Tsotsitaal has nearly disappeared since it is not recruiting new 
speakers (Childs 1994b); potential speakers use Isicamtho or similar varieties, which 
have replaced Tsotsitaal as the language of young black urban males. Isicamtho 
thus may represent the modern-day equivalent of earlier creoles, now arising in 
cities and used by an economically deprived and alienated youth. The socioeconomic 
conditions in apartheid-bound South African cities are in many ways comparable 
to the harsh conditions of slavery. 


[Tsotsitaal] flourished in locations [townships] from Randfontein to Springs, 
Pretoria and Vereeniging. Former famous black residential areas like Sophiatown, 
Western Native township [Soweto], and Alexandria went under quaint names like 
“Kofifi,” “Kasbah,” and “Dark City.” ...It was the few from Jo’burg and the Reef 
who spoke it and were looked upon as strange people. (Bikitsha 1982) 


Isicamtho, however, is clearly Zulu and has resulted from the alienation of urban 
dwellers from the Afrikaans-speaking minority. The considerable differences 
between Isicamtho and Tsotsitaal in syntax and morphology are illustrated in (10). 
Word order is vastly different: Tsotsitaal is head-final with a copula; Isicamtho 
is head-initial with a verbal locative (no copula). Note also the difference in order 
of the dependent elements within the subject noun phrase. ‘Three’ in Isicamtho 
is a verbal construct (as is the locative). The Isicamtho version has typical Bantu 
agreement markers on ‘his’ ba-khe, on the “relative” ‘three’ aba-yi-three, and on 
the deficient verb be-be-daa, markers not found in the Tsotsitaal equivalent. The 
shared form in (10) is the word for ‘there’ (za, da vs. -daa < Afrikaans daar ‘there’). 
In Tsotsitaal it is a locative adverb while in Isicamtho it is a (bound) “deficient” 
verb preceded by two verbal Zulu-like prefixes (be-be-). 

(10) Isicamtho and Tsotsitaal, ‘Three of his friends were there.’: 
     Tsotsitaal   Hom drie  vrine  was za. 
                  Som drie  chomis was da. 
                  his three friend be  there 
     Isicamtho    Abobra bakhe abayithree bebedaa. 
                  friend his   three      be-there 


Although Isicamtho has some functional and linguistic ties to Tsotsitaal, it 
should be understood as distinct since it is clearly a dialect or slang of Zulu (Mfusi 
1990; Childs 1997). 

The phenomenon which Isicamtho exemplifies is widespread in Africa, i.e. the 
solidification of an urban consciousness through language. These are at least some 
of the same forces at work in the formation of creoles, particularly among those 
to whom access to the elite or superstrate culture has been denied (divergence in 
a conflict model, Rickford 1986). Speakers of Isicamtho mark themselves off as 
nonrural by the speech variety they use. Speakers further mark themselves off as 
male, perhaps criminal, etc. (see (9) above). What makes the South African situ- 
ation different from the rest of Africa, and perhaps unique, is the oppressiveness 
of apartheid and the futility of life for young urbanites, felt particularly strongly 
by males. These forces account for the change of Isicamtho from a restricted 
criminal argot to the main variety of an alienated youth. To what extent this 
alienation continues with majority rule in South Africa may well determine the 
variety’s future. 

Guinea French, at least one variety of which is spoken in the metropolis, is 
illustrative of what can happen with a colonial legacy. Several orally presented 
papers summarized in Childs (2003a) discuss the status and nature of French in 
Guinea. Guinea has a special history vis-à-vis the other former French colonies; 
Sékou Touré famously said “Non!” to continued relations with France, and the 
French did everything they could to undermine his regime. What has happened 
in Guinea, then, exemplifies well what happens when an external norm is 
abruptly removed. 

As exhibited in (11), the first two restructured varieties qualify as pidgins and 
have many features in common despite their different provenances. In addition 
there once was a French much closer to the European variety known to profes- 
sionals and used by them in formal situations or when there was no common 
language. People in Guinea have no inhibitions about using the three widely spoken 
languages Malinké, Fula, and Soso, the language of political power, and thus rarely 
use French. A study at the Malinké capital of Kankan confirmed this observation. 
The last variety is the one relevant to the discussion in this section. 


(11) Subvarieties of Guinea French (Childs 2003a): 
a. Soldier French (français tirailleur): a variety used by older individuals 
who had ‘joined’ the French army 
b. Commercial French (français commerçant): a variety based on no formal 
education, used by merchants, drivers, etc. 
c. School French (français scolaire): clearly an incompletely learned French, 
likely the result of insufficient exposure to any variety of French 
d. Urban Guinean French (français urbain): the variety used in Conakry and 
learned on the street, an urban lingua franca 


Urban Guinean French (along with other French varieties) performs the same functions 
as Isicamtho and many other urban vernaculars, such as those given below. 


(12) 
a. Lingala and especially its convergence with Swahili in Zaire 
b. Sango in the Central African Republic 
c. Nairobi Sheng 
d. Indoubil in Zairean cities 
e. (Town) Bemba in Zambia 


This list confirms the presence of many other similar varieties as well as their long- 
established presence (see Kiessling & Mous 2004 for even more examples). 

In summary, these urban varieties function similarly to mark off their speakers 
as nonrural and part of an urban culture. This is confirmed by the absence of 
ideophones in these varieties, a robust category elsewhere in Africa (Childs 
1994a). It has been shown that ideophones, a quintessentially locally oriented word 
category, disappear in restructured varieties when the speakers wish to disavow 
a local identity (Childs 1996; 1998). 

The linguistic source of these varieties will vary. When there is the legacy of a 
colonial language, it serves as the basis for the lexicon, as is the case with Urban 
Guinea French. In the absence of such a variety or when that variety symbolizes 
oppression (Afrikaans in South Africa), urban youth will creatively use whatever 
linguistic resources they have at their disposal. In the case of Johannesburg it is 
Zulu, and Isicamtho is the result. 


6 Conclusion 


This chapter has presented some of the diversity of contact phenomena in Africa, 
looking at contact between African languages and between African and colonial 
languages. The concentration has been less on micro-linguistic consequences than 
on the causes and the effects on a societal level. That there is much work 
to be done is evident from the discussion. The emphasis of future research 
should be on the sociohistorical side. 

On the historical side linguists need to become better informed in the ways illustrated 
in Dimmendaal (1998), Mous (2003), and Kiessling, Mous, and Nurse 
(2008). In the latter work, for example, we are provided with historical dates, 
population figures, data on changes to the group’s size, as well as the types of 
societies and prestige and power relations (Kiessling et al. 2008: 187-9). In other 
cases we are provided merely with the linguistic contact information and left to 
infer the history. Because the history is not well known, other techniques will have 
to be used, as demonstrated in Blench (2006). If linguists can become more open 
to related fields, the progress in understanding both current and historical cases 
of language contact will be considerable. 


NOTES 


1 www.jlcjournal.org. 

2 The former “Atlantic Group” has been proposed as consisting of three separate 
branches “North Atlantic,” “South Atlantic,” and an isolate Bijogo (Blench 2006). I accept 
this division but for heuristic purposes refer to the assemblage as the “Atlantic 
(Group)” in what follows. 

3 “Documenting convergence and diversity - Mande and Atlantic languages in contact,” 
SOAS, University of London, September 6-9, 2008. 

4 Childs (2003b) presents evidence for an internal scenario for the appearance of the ‘O’ 
in a split predicate, but the cause may be contact with Mande, whose languages are 
overwhelmingly S-Aux-O-V (Güldemann 2008). 

5 The terminology here follows traditional usage in the interests of clarity for the reader 
(see Jourdan 1991 and Holm 1988). 


35 Contact and Siberian 
Languages 


BRIGITTE PAKENDORF 


This chapter provides a brief description of contact phenomena in the languages 
of Siberia, a geographic region which is of considerable significance for the field 
of contact linguistics. As this overview cannot hope to be exhaustive, the main 
goal is to sketch the different kinds of language contact situation known for 
this region. Within this larger scope of contact among the languages spoken in 
Siberia, a major focus will be on the influence exerted by Evenki, a Northern 
Tungusic language, on neighboring indigenous languages. 

The chapter is organized as follows: after a brief introduction to the languages 
and peoples of Siberia (section 1), the influence exerted on the indigenous 
languages by Russian, the dominant language in the Russian Federation, is 
described in section 2. This is followed by a short description of the two pidgins 
and one mixed language known from Siberia (section 3). The mutual influences 
at play among the indigenous languages of Siberia are illustrated with three short 
case studies of Evenki influence on its neighbors (section 4), followed by some 
concluding remarks in section 5. 


1 The Languages and Peoples of Siberia: 
Introduction 


Siberia is the vast geographic area that dominates the Eurasian landmass, bor- 
dering on the Ural Mountains in the west, the Arctic Sea in the north, the Sea of 
Okhotsk and the Pacific Ocean in the east, and northern China, Mongolia, and 
Kazakhstan in the south. In Russian usage, however, the regions bordering on 
the Sea of Okhotsk and the Pacific Ocean are generally excluded from Siberia proper, 
often being classified as the Far East instead (Encyclopedia Britannica 1998, vol. 10: 
776; Severnaja Enciklopedija 2004: 226). 

Siberia is characterized by a severely continental climate, with very cold winters 
(temperatures in January average between −30°C and −40°C in most areas, 
and can drop to −60°C or below in parts of the northeast) and hot summers (with 
temperatures in July reaching +30°C and more; Brockhaus 2001, vol. 20: 160-1). 
The vegetation mainly consists of dense coniferous forest (taiga), with a forest-steppe 
and steppe zone along the southern border and a belt of tree- and shrubless 
tundra along the northern edge (Encyclopedia Britannica 1998, vol. 10: 776; 
Brockhaus 2001, vol. 20: 161). Due to the severe climatic and ecological conditions, 
Siberia is extremely sparsely populated, with population densities averaging less 
than two persons per km² (Severnaja Enciklopedija 2004: 616). Such low population 
density may have precluded frequent contact among the indigenous ethnolinguistic 
groups, especially in the past (cf. Stern 2005b: 290). Siberia is therefore not the 
first region of the world that comes to mind when studying language contact; 
nevertheless, the indigenous languages show several structural similarities, leading 
Anderson (2004; 2006) to speak of a “Siberian linguistic macro-area.” 

Over 30 languages belonging to 8 language families are spoken in Siberia. 
Nowadays, two of these families (Yeniseic and Yukaghir) are represented by only 
one or two daughter languages, while in the Far East the isolate language Nivkh 
is spoken. The language families found in Siberia are (following a rough west to 
east orientation): Uralic, the nearly extinct Yeniseic family, Turkic, Tungusic and 
Mongolic (these three are sometimes classified as belonging to the Altaic language 
family, e.g. Georg et al. 1998), the very small and nearly extinct Yukaghir family, 
Chukotko-Kamchatkan, and Eskimo-Aleut. The Yukaghir family (of which 
nowadays only two highly endangered languages survive, Tundra Yukaghir 
and Kolyma Yukaghir) might possibly be distantly related to the Uralic languages 
(cf. references in Maslova 2003a: 1). In addition to the isolate Nivkh, a further 
isolate, Ainu, used to be spoken in southern Sakhalin, on the Kurile Islands, and 
on the southernmost tip of Kamchatka. However, following World War II all 
Ainu-speakers moved to Japan (de Graaf 1992: 186). Table 35.1 presents a list of 
the languages currently still spoken in Siberia; their geographic distribution is shown 
in Figure 35.1. 

As mentioned above, Anderson (2004; 2006) speaks of a linguistic area with 
respect to the languages of Siberia. Typological features well known to be shared 
by a number of the languages are a system of vowel harmony, agglutinative mor- 
phology, relatively large case systems, predominantly SOV word order, and the 
widespread use of converbs or case-marked participles to mark subordination 
(Anderson 2004: 36-40, 65-9; 2006; Comrie 1981: 59, 71, 117, 244, 246, 258). 
Among other features described by Anderson as characterizing the Siberian 
linguistic area are a four-way distinction between labial, alveodental, palatal and 
velar nasals, a morphologically marked reciprocal voice, a distinction between a 
comitative and an instrumental case, and a distinction between a dative and an 
allative case (Anderson 2006: 268-73, 279-92). However, the distinction between 
an allative and a dative case proposed by Anderson appears to be characteristic 
of the Tungusic language family alone, not a widespread areal feature. Apart from 
the Tungusic languages, only a few languages at the margins of the geographic 
area show this distinction, such as one dialect of the Samoyedic language Selkup, 
the South Siberian Turkic languages Khakas and Tuvan, and the Chukotko- 
Kamchatkan language Koryak. In contrast, the majority of languages spoken in 


Table 35.1 The languages of Siberia, their linguistic affiliation and 
approximate number of speakers (based on 2002 census figures)ᵃ 


Family (and subfamily)    Language            Number of speakers 
Uralic (Ob-Ugric)         Khanty              13,568 
                          Mansi               2,746 
Uralic (Samoyedic)        Nenets              31,311 
                          Enets               119 
                          Nganasan            505 
                          Selkup              1,641 
Yeniseic                  Ket                 485 
Turkic                    Siberian Tatar      –ᵇ 
                          Chulym Turkic       270 
                          Tuvan               242,754 
                          Tofa                378 
                          Khakas              52,217 
                          Shor                6,210 
                          Altai               65,534 
                          Sakha (Yakut)       456,288 
                          Dolgan              4,865 
Tungusic (Northern)       Evenki              7,584 
                          Even                7,168 
                          Negidal             147 
Tungusic (Southern)       Udihe               227 
                          Oroč                257 
                          Nanay               3,886 
                          Orok (Ul'ta)        64 
                          Ulča                732 
Mongolic                  Buryat              368,807 
Yukaghir                  Kolyma Yukaghir     604ᶜ 
                          Tundra Yukaghir 
Chukotko-Kamchatkan       Chukchi             7,742 
                          Koryak              3,019 
                          Kerek               15 
                          Alutor              40 
                          Itelmenᵈ            385 
Eskimo-Aleut              Eskimo languages    410ᵉ 
                          Aleut               175 
Isolate                   Nivkh               688 


ᵃ These numbers are certainly largely overestimated, since individuals frequently name 
their heritage language as their “mother tongue” when asked, even when they do not 
actually speak the language any more (cf. Vaxtin 2001: 77-8). Thus, estimates of 
numbers of speakers based on sociolinguistic data are much lower for most of the 
languages of Siberia, the vast majority of which are on the verge of extinction (cf. Vaxtin 
2001: 163-80; and Kazakevič & Parfënova 2000: 283-5; Nikolaeva & Tolskaya 2001: 25-6 
for individual linguistic groups). 

ᵇ This subgroup of Tatar speakers is not listed separately in the census. 

ᶜ In the 2002 census the two Yukaghir languages were not distinguished. 
A sociolinguistic survey conducted in 1987 counted 29 speakers of Kolyma 
Yukaghir and approximately 90 speakers of Tundra Yukaghir (Maslova 2003a: 2). 

ᵈ Although Itelmen is generally classified as belonging to the Chukotko-Kamchatkan 
languages (e.g. Comrie 1981: 240), this is not undisputed; an alternative hypothesis 
suggests that the similarities with Chukchi and Koryak are due to areal influences 
(cf. Georg & Volodin 1999: 224-41). 

ᵉ The census gives a joint number for “Inuit, Sireniki, and Yuit.” Inuit belongs to the Inuit- 
Inupiaq subgroup of the Eskimo languages, while Sireniki and Yuit (also called Central 
Siberian Yupik) are languages belonging to the Yupik subgroup (de Reuse 1994: 1-2). 
Siberia (i.e. the Samoyedic languages Nenets, Nganasan and most dialects of Selkup, 
the Ob-Ugric languages Mansi and Khanty, the Mongolic language Buryat, the 
Turkic languages Tofa, Sakha, and Dolgan, the Chukotko-Kamchatkan languages 
Chukchi and Itelmen, as well as Ket and Yukaghir) use only one case to mark 
indirect objects, addressees of verbs of speech, and goals of motion alike. It might 
therefore be preferable to speak of the lack of a distinction between a dative and 
allative case as being typical of this area. 

The indigenous groups of mainland Siberia were for the most part nomadic 
hunters and gatherers or semi-sedentary fishermen; along the Pacific coast and 
the Sea of Okhotsk, a number of groups were sedentary hunters of large sea mam- 
mals. In the southern steppe zone, on the other hand, cattle and horse pastoral- 
ism prevailed; this mode of subsistence was imported to northeastern Siberia in 
relatively recent times by the Turkic-speaking Sakha (Yakuts). Apart from the 
cattle and horses predominant in South Siberia, animals kept in this region are 
dogs and domesticated reindeer. Dogs are used mainly for help with reindeer herd- 
ing in western Siberia, and as a means of transport and for hunting in eastern 
Siberia. In the tundra zone, domesticated reindeer furnished all the necessities of 
life, while in the forest zone reindeer are kept chiefly as a means of transport, 
with subsistence based on hunting, fishing, and gathering (Severnaja Enciklopedija 
2004: 262-3, 686). 

Little is known about contact between different ethnolinguistic groups before 
Russian colonization, which started at the turn of the sixteenth and seventeenth 
centuries. Sporadic warfare and territorial conflicts, exacerbated by the upheavals 
following Russian colonization, are known to have taken place between different 
peoples of Siberia (Forsyth 1992: 11, 58, 80; de Reuse 1994: 296; Slezkine 1994: 27-8); 
these often resulted in the capture of women from the defeated enemy (Forsyth 
1992: 67; Slezkine 1994: 6, 44). Some trade relations existed in the eighteenth and 
nineteenth centuries between the nomadic reindeer-herding Chukchi and their 
neighbors, from the Yukaghirs, Evens and Sakha (Yakuts) in the west to the Eskimos 
in the east (de Reuse 1994: 296, 307; Maslova & Vaxtin 1996: 999), as well as between 
the coastal Chukchi and Koryaks and their reindeer-breeding compatriots from 
the interior (Forsyth 1992: 72). In the nineteenth century, the Turkic language Sakha 
played an important role as a vehicular language in large areas of northeastern 
Siberia (Wurm 1996: 976), while in Chukotka and Kamchatka Chukchi was in use 
for interethnic communication by Eskimos, Evens, and Kereks (Wurm 1992: 250; 
de Reuse 1994: 296; Burykin 1996: 990). Some cases of language shift have been 
documented, such as the shift of Samoyedic and Yeniseic speakers to Turkic 
languages in South Siberia, and the shift of Evenks to Buryat (Forsyth 1992: 23; 
Anderson 2004: 6; Slezkine 1994: 28; Cydendambaev 1981; Cimitdorzieva 2004). 
Nowadays, speakers of Evenki and Even dialects in the Republic of Sakha 
(Yakutia) are under strong influence from the locally dominant language Sakha, 
leading to numerous contact-induced changes in the Tungusic languages and lan- 
guage shift to Sakha (Malchukov 2006). Multilingualism is recorded for speakers 
of the Eskimo languages Naukan and Sireniki, who were fluent in Chukchi 
and other Eskimo languages (de Reuse 1994: 306), while Yukaghir-Even-Sakha-Chukchi 
quadrilingualism existed in northeastern Yakutia from the nineteenth 
century, and perhaps earlier, up to the 1940s (Maslova & Vaxtin 1996: 999). However, 
it is not known whether such multilingualism would have been characteristic of 
interethnic relations in precolonial times as well. 

Russians first entered Siberia in the late sixteenth century, with garrisoned forts 
established on the Irtysh river in 1586 and 1587, on the Yenisey river in 1604, on 
the middle Lena in 1632, and on the Anadyr river in 1649 (Forsyth 1992: 34, 36, 79). 
Further small outposts were scattered in between to aid in the collection of fur tax. 
During the first centuries of colonization, Russian interference in the life of the 
indigenous peoples consisted predominantly in the collection of fur tax, the con- 
scription of indigenous peoples into providing transportation for Russian officials, 
as well as superficial Christianization (Gernet 2007: 69-72; Slezkine 1994: 23-4, 
32, 43-4, 48-53). Although by the end of the seventeenth century there may have 
been as many Russian settlers as indigenous peoples in Siberia, these immigrants 
were concentrated in the more fertile southern districts of Western Siberia (Forsyth 
1992: 100). In the northern and eastern regions Russians were scarce and often 
outnumbered by the local people (Forsyth 1992: 101; Stern 2005b: 292). Therefore, 
a knowledge of Russian among the indigenous groups was not very widespread 
during the tsarist period of colonization (cf. Matić 2008: 100; Burykin 1996: 994). 

That situation changed, however, after the establishment of Soviet rule in the 
1920s. In the initial years the Soviet state encouraged the maintenance of the indigen- 
ous languages, and a number of orthographies were created for the unwritten 
languages of Siberia. However, at a later period, especially in the 1960s and 1970s, 
language policies changed drastically, and children of indigenous minority peo- 
ples were forcibly taken to boarding schools where they were forbidden to speak 
their native languages. Furthermore, after World War II large numbers of settlers 
from the European parts of the Soviet Union (especially Russians, Ukrainians, and 
Belorussians) came to Siberia to exploit the natural resources, so that the indigen- 
ous peoples were greatly outnumbered by the settlers (Forsyth 1992: 360, 361, 
405). This led to a large-scale Russification of all spheres of life (Helimski 1997: 
77; cf. de Graaf 1992: 190, 191 specifically for the Nivkh; Anderson 2005: 125-7 
for the Khakas). 

Nowadays, the majority of Siberian indigenous languages are moribund, with 
only a few elderly speakers remaining, and no more acquisition by children (Vaxtin 
2001: 163-80). Only a few of the larger ethnic groups have been able to maintain 
their heritage language in a viable state, for instance the Turkic-speaking Sakha 
(Pakendorf, field observation), or the Samoyedic-speaking Nenets (Ljublinskaja 
2000: 312; Vaxtin 2001: 163). 


2 Russian Influence on the Indigenous Languages 
of Siberia 


As mentioned in section 1, several factors have led to the widespread use of Russian 
among speakers of indigenous Siberian languages: Firstly, since Russian was the 
predominant language in the Soviet Union, and is the language used in practically 
all spheres of public life in the Russian Federation, a good knowledge of 
Russian was and is expected to lead to upward social mobility and better job chances 
(Comrie 1989: 146; Kazakevič and Parfënova 2000: 288). Secondly, Russian functions 
as a lingua franca between individuals from diverse ethnolinguistic groups, 
and is used as the medium of communication in mixed marriages, even when it 
is not the first language of either spouse (Comrie 1989: 146). Furthermore, since 
the late 1930s schooling has mainly been in Russian, which has in many cases led 
to a complete break in transmission of the native language. Last but not least, 
speakers of minority languages were frequently encouraged, more or less officially, 
to give up their language for a bigger language, often Russian (Comrie 1989: 148; 
Kibrik 1991: 10). 

It therefore comes as no surprise that the indigenous languages of Siberia show 
marked Russian influence. All of them exhibit a large number of lexical copies¹ 
from Russian, with phonological differences depending on the time of copying. 
In the early, pre-revolutionary period of contact, relatively few items were copied 
into the indigenous languages; these were predominantly designations of novel 
cultural items such as “bread” or “tea” and were adapted to the phonological 
system of the recipient language. During the Soviet era, on the other hand, a large 
number of Russian copies entered the indigenous languages, mostly without any 
phonological adaptation (Comrie 1996: 36; Kaksin 1999: 221-2; Nevskaya 2000: 
285; Malchukov 2003: 237; Matić 2008: 103-4; Grenoble 2000: 106). 

In addition to importing a large number of lexical items from Russian, the indigen- 
ous languages of Siberia have also undergone structural changes that can be traced 
to Russian influence. Thus, a shift can be observed in the use of some cases, for 
example the use of the instrumental instead of the dative case to mark the overt 
agent of passive constructions in Evenki and Khakas (Gladkova 1991: 68; 
Grenoble 2000: 109; Anderson 2005: 172), the use of the dative instead of the 
allative to mark the addressee of verbs of speech in Evenki² (Gladkova 1991: 68, 
Grenoble 2000: 109), as well as the development of dative case-marked experi- 
encer subjects, and the extension of the dative case to mark direct objects in Ket 
(which lacks the accusative case used for this purpose in Russian; Minayeva 2003: 
48, 50-1). 

The most salient structural changes undergone by Siberian languages in con- 
tact with Russian are in the domain of syntax. Thus, a shift toward a less strict 
verb-final word order has been noted in some Tungusic languages (Malchukov 
2003: 241; Grenoble 2000: 107-8; Gladkova 1991: 68), in Nivkh (Gruzdeva 2000: 
125-6), and in Khakas (Anderson 2005: 222). Instead of the previously widespread 
use of parataxis, coordinate sentences joined with conjunctions copied from 
Russian have been documented in Samoyedic languages (Batori 1980: 144) and 
in Evenki (Grenoble 2000: 115). Finite subordinate clause constructions copied from 
Russian are increasingly replacing the indigenous use of case-marked participles 
or converbs (cf. Anderson 2004: 69-72), for example in the Tungusic languages 
(Malchukov 2003: 241; Grenoble 2000: 116-18; Gladkova 1991: 68), in Yukaghir 
(Matić 2008: 117-19), in Shor (Nevskaya 2000: 286), in Khakas (Anderson 2005: 
196-221), and in Enets (Sorokina 1991: 66-7; Khanina & Shluinsky 2008: 71-3). 
These copied constructions make use of indigenous adverbials as complementizers 
or conjunctions, but use of conjunctions and complementizers copied from 
Russian has been documented as well (1a). The formation of relative clauses with 
the use of interrogative pronouns as relativizers (1b) has been described for Evenki 
(Malchukov 2003: 241) and for Khakas (Anderson 2005: 205-9). Interestingly, Forest 
Enets appears to be developing finite relative clauses not with an interrogative 
pronoun, but with a demonstrative functioning as relativizer (Khanina & 
Shluinsky 2008: 70-1).³ 


(1) a. Yukaghir (Matić 2008: ex. 33; taken from Nikolaeva 2004: 29.49) 
jesli Germanija kejdej-te-j [...] taynugi er-ce 
if Germany advance-FUT-INTR.3SG then bad-ATTR 
modol o:-te-j 
life COP-FUT-INTR.3SG 
‘If Germany wins [...] then life will be bad...’ 

cf. Russian 
jesli Germanija pobedit [...] žizn’ budet ploxoj 
if Germany win.FUT.3SG life be.FUT.3SG bad.INS.F 
‘If Germany wins, life will be bad.’ 

cf. uninfluenced Yukaghir (Matić 2008: ex. 31; taken from Nikolaeva 2004: 37.4-5) 
touke Cuge l’e-de-jne [...] odul-yin gon-te-jek 
dog trace COP-3-DS.COND.CVB Yukaghir-DAT go-FUT-2SG.INTR 
‘If there are dog traces there [...] you will marry a Yukaghir.’ 


b. Evenki (Malchukov 2003: ex. 6b) 
i-le hurkeken suru-re-n gorot-tu... 
where-ALL boy go-NFUT-3SG town-DAT 
‘In the town where the boy is going...’ 

cf. Russian 
v gorode, kuda idët mal’čik... 
in town.PREP where.ALL go.PRS.3SG boy 
‘In the town where the boy is going...’ 

cf. uninfluenced Evenki (Malchukov 2003: ex. 6a) 
hurkeken suru-mecin-du-n gorot-tu... 
boy go-FUTPT-DAT-POSS.3SG town-DAT 
‘In the town where the boy is going...’ 

Note that it is not only the use of conjunctions and relative pronouns that has been 
copied from Russian, but also the use of finite verbs in subordinate clauses. Thus, the 
impact of Russian on the languages of Siberia is leading to a gradual typological 
shift. 




3 Pidgins and Mixed Languages in Siberia 


Only two Russian-based pidgins have been recorded in Siberia: Chinese Pidgin 
Russian (also known as Siberian pidgin, Kjakhta pidgin, or the Majmachin speech) 
spoken previously in the Chinese-Russian border town of Kjakhta as well as along 
the Lower Amur, and Taimyr Pidgin Russian (also known as Govorka) spoken 
on the Taimyr Peninsula (Wurm 1992: 252, 259; Perexval’skaja 2006: 13). In addi- 
tion, in the nineteenth century a number of trade jargons may have existed in 
Chukotka involving Chukchi, Eskimo, and English, which were used for com- 
munication between Chukchi and Eskimos, as well as with sailors of whaling or 
expedition ships (de Reuse 1996: 58). The scarcity of pidgins in Siberia as com- 
pared to other colonies can be explained by the fact that the Russians did not 
relocate individuals from different ethnolinguistic groups for purposes of forced 
labor, so that there was no occasion for a system of interethnic communication to 
arise spontaneously (Stern 2005b: 289). And by the time people were resettled in 
linguistically mixed villages in the mid-twentieth century, access to standard Russian 
as a lingua franca was ensured through obligatory schooling in Russian. 

Chinese Pidgin Russian was initially the language used by Chinese and 
Russian traders in the trading towns of Kjakhta and Majmachin from the early 
eighteenth to the early twentieth century⁴ (Sprincin 1968: 87; Wurm 1992: 259). 
A derivative of this pidgin was also spoken in Harbin (Sprincin 1968: 98-9; 
Wurm 1992: 263), and it later spread to the Lower Amur region, where it played 
a role in the development of the pidginized Russian spoken by local Tungusic 
peoples (Nichols 1980: 397; Khasanova 2000: 182, 193). Chinese-Russian pidgin 
is characterized by large-scale insertion of epenthetic vowels to maintain the CV 
syllable structure characteristic of Chinese, loss of case-marking, loss of agreement, 
and a complete lack of inflection on verbs, which instead are used in the Russian 
imperative form. Optional tense-marking is achieved through the postposition of 
tense forms of the Russian verb ‘to be’, i.e. esi/esa (for present tense, but occur- 
ring with future and past meanings as well), bylo (for past tense) and budu (for 
future tense) (Sprincin 1968: 92, 96-7; Nichols 1980: 401; Wurm 1992: 260-2). The 
lexicon is mainly of Russian origin, although the Harbin variant of this pidgin 
contains rather more words of Chinese origin (Sprincin 1968: 98-9). 
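
The optional tense-marking just described can be illustrated with a minimal sketch. It assumes only what the description above states: the verb appears in its Russian imperative form and an optional form of ‘to be’ follows it. The function name `mark_tense` and the choice of `esa` rather than `esi` for the present are illustrative assumptions, not part of the cited descriptions.

```python
from typing import Optional

# Postposed forms of the Russian verb 'to be' used as optional tense markers.
# "esa" is chosen here over the variant "esi" purely for illustration.
TENSE_MARKERS = {"present": "esa", "past": "bylo", "future": "budu"}


def mark_tense(imperative_verb: str, tense: Optional[str] = None) -> str:
    """Return a pidgin verb phrase: imperative form plus optional postposed marker."""
    if tense is None:
        # Tense-marking is optional, so the bare imperative form is also valid.
        return imperative_verb
    if tense not in TENSE_MARKERS:
        raise ValueError(f"unknown tense: {tense!r}")
    return f"{imperative_verb} {TENSE_MARKERS[tense]}"
```

Under these assumptions, `mark_tense("skazyvaj", "future")` yields the sequence `skazyvaj budu`, which also occurs in example (2a).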

In contrast to Chinese Pidgin Russian, which developed as a trade language, 
Taimyr Pidgin Russian was predominantly developed and used for interethnic 
communication by Dolgans and Nganasans. Russians were probably not directly 
involved in the development of this pidgin, since they were not in direct contact 
with the Nganasans (Stern 2005b: 290-1). Nowadays the lexicon of Taimyr 
Pidgin Russian consists mainly of Russian words; however, previously there may 
have been a large number of lexical items from Dolgan (Ubrjatova 1985: 68). This 
pidgin is characterized by a lack of case-marking, with one predominant post- 
position mesto ‘place’ marking non-core arguments. A sociative marker meste 
(derived from Russian vmeste ‘with’) also exists; often mesto and meste are used 
interchangeably. A further postposition toroba (derived from Russian storona 
‘side’) marks location (Stern 2005b: 301). In contrast to Kjakhta Chinese-Russian 
pidgin, in which verbs are uninflected, Taimyr Pidgin Russian shows some ver- 
bal inflection. Even in the “basilectal” system, which has been less influenced by 
standard Russian, verbs take person-marking; however, there is no strict agree- 
ment with the subject. Rather, the third person singular and first person plural 
forms predominate, while second person singular forms are rarely used (Stern 
2005b: 309). Another difference between the two pidgins is that in Taimyr Pidgin 
Russian personal pronouns are based on the Russian genitive-accusative forms, 
while in Chinese-Russian pidgin they derive from Russian possessive pronouns. 
Some of the salient differences between the two pidgins are illustrated in the 
following examples. 


(2) a. Chinese-Russian pidgin (Wurm 1992: 263) 
za moja Nikita skazyvaj budu kako Dalaj pogovori esa 
for 1SG[POSS] N. tell[IMP] will[1SG] how D. talk[IMP] is 
‘I will tell Nikita how Dalaj (i.e. addressee) is speaking.’ 
b. (Sprincin 1968: 94) 
sobuka nizu živi 
hill under live[IMP] 
‘I live at the foot of the hill.’ (sobuka < Russian sopka) 


(3) a. Taimyr Pidgin Russian (Stern 2005b: ex. 56) 
taperja menja budem šamanit’ 
now 1SG[ACC] will[1PL] act.as.shaman[INF] 
‘Now I will act as shaman.’ 
b. (Stern 2005b: ex. 10) 
utrom nganasan tut baba mesto govorit 
in.the.morning Nganasan here woman place say[PRS.3SG] 
‘On the following morning that Nganasan says to his wife.’ 


Only one contact language in Siberia emerged as the result of relocation of peo- 
ples for labor purposes: Copper Island Aleut (CIA). This mixed language with a 
predominantly Aleut lexicon is characterized by Aleut noun inflection, derivational 
morphology, and nonfinite verb inflection, but by Russian finite verb morpho- 
logy and pronouns (Thomason 1997: 457, 460). It arose on Copper Island, one of 
the Commander Islands off the coast of Kamchatka, which was uninhabited when 
discovered in 1741. In 1826 the Russian-American Company settled Aleuts on the 
Commander Islands to work in the seal-slaughtering trade along with Russian 
employees. A population called “creoles”⁵ arose at an early stage of the island’s 
settlement out of the union of Aleut women and Russian men (Thomason 1997: 
451). These creoles were a socially and economically distinct group — they had a 
different legal status from and were better off economically than the Aleuts, 
but were looked down upon socially by both the Russians and the Aleuts since 
they were of illegitimate birth, at least in the early period (Thomason 1997: 
453-4). 

Like other Aleut dialects, CIA has only two cases (absolutive and relative), possessive 
suffixes, singular, dual and plural number on nouns, and no gender distinctions. 
It has two sets of pronouns, derived from Aleut and Russian, which are 
used in distinct constructions: The Aleut pronouns are restricted to reflexive verbs, 
while Russian pronouns occur as subject markers, and in their accusative form 
have replaced the original Aleut objective conjugation of the verb (Golovko 1996: 
70-71). The most notable difference between CIA and other Aleut dialects is the 
system of finite verbal inflection, which in CIA derives entirely from Russian. 
In the present tense, verbs take Russian portmanteau suffixes for each person-number 
combination; in contrast to the nominal system, a dual number is lacking 
for verbs. In the past tense, the Russian past tense marker -l is used (Thomason 
1997: 458-9). The following examples demonstrate the use of Russian pronouns 
and finite verb markers in CIA (4a, 5a) in comparison with Bering Island Aleut 
(4b, 5b). 


(4) a. Copper Island Aleut (Golovko 1996: ex. 18) 
ona hikta-it čto ona ego ilaxXta-it 
3SG.NOM.F say-PRS.3SG that 3SG.NOM.F 3SG.ACC.M love-PRS.3SG 
Russian: ona govor-it čto ona ego ljub-it 
3SG.NOM.F say-PRS.3SG that 3SG.NOM.F 3SG.ACC.M love-PRS.3SG 
‘she says that she loves him.’ 
b. Bering Island Aleut (Golovko 1996: ex. 19) 
ilaXta-ku-u 
love-REAL-3SG.OBJ.3SG.SBJ 
‘s/he loves him/her/it.’ 


(5) a. Copper Island Aleut (Golovko 1996: ex. 20) 
ty menja hamayaaxta-is 
2SG.NOM 1SG.ACC ask-PRS.2SG 
Russian: ty menja sprašiva-eš 
2SG.NOM 1SG.ACC ask-PRS.2SG 
‘You ask me.’ 
b. Bering Island Aleut (Golovko 1996: ex. 20) 
ting ahmayaakta-ku-Xt 
1SG.OBJ ask-REAL-PRS.2SG 
‘You are asking me.’ 


CIA must have arisen between the period of initial settlement of Copper Island 
in 1826 and approximately 1900. It most probably arose before the demise of the 
Russian-American Company in 1867, which led to the departure of most of the 
Russians from the Commander Islands and to the end of the special social and 
legal status of the creoles (Thomason 1997: 461, 465). This mixed language must 
therefore have arisen in a very short time, in at most two generations. It prob- 
ably did not arise as a pidgin, because neither the Aleut nor the Russian com- 
ponent is simplified. Not much is known about the use of Aleut and Russian on 
Copper Island in the early years of its settlement; however, the creole population 
was probably fluent in both languages, and it may well be that the long-term Russian 
settlers knew Aleut (Thomason 1997: 462-3). The most likely explanation for the 
development of CIA is that it arose in a setting of bilingual code-switching, with 
some “creative decisions” by the speakers themselves as to what form the final 
product would take (Thomason 1997: 464-5; Golovko 2003: 190-8). In this, CIA 
differs from Taimyr Pidgin Russian and Chinese-Russian pidgin, which arose as 
a means of communication in the absence of a common language between the 
groups in contact. 


4 Language Contact among the Indigenous 
Languages: The Influence Exerted by Evenki 


Notwithstanding the vast geographic expanses and low levels of settlement, the 
indigenous peoples of Siberia have been in contact with each other over the course 
of centuries, as demonstrated by contact-induced changes in their languages. 
A well-described case is the influence of Chukchi on neighboring Eskimo lan- 
guages (de Reuse 1994), while lexical copying among different languages has been 
documented over the whole geographical area (cf. Anderson 2004: 21-4 for a brief 
overview and further references). The role played by language contact in shap- 
ing linguistic diversity in Siberia will be further exemplified by three brief case 
studies involving the Northern Tungusic language Evenki as the model. 

Evenki consists of a large number of dialects that are spoken over a vast area 
of Eurasia, from the Ob-Yenisey watershed in the west to the coast of the Sea of 
Okhotsk in the east, and from the fringes of the Taimyr Peninsula in the north to 
the Baikal region and the sources of the Amur in the south (Atknine 1997: 110; 
cf. Figure 35.1). Evenks were traditionally highly mobile nomadic hunters who 
used domesticated reindeer for transport and who were, and still are, in contact with 
speakers of a great many different languages. Therefore, Anderson (2006: 294) suggests 
that they may have played the role of “vectors of diffusion” of at least some 
of the features that characterize the Siberian macro-area, although the spread of 
the Northern Tungusic languages over the vast area they occupy today may have 
taken place quite recently, not more than 600 or 700 years ago (Janhunen 1996: 171). 


4.1 Evenki influence on the Buryat converbal system 


As with other Tungusic languages, Evenki has an elaborate system of converbs 
with diverse semantic and syntactic properties that function in syntactic reference 
tracking. The converbs differ with respect to their syntactic distribution: Some con- 
verbs, called same-subject (SS) converbs, can occur only in subordinate clauses 
with a subject coreferential with that of the main clause. Some converbs, called 
different-subject (DS) converbs, can occur only in subordinate clauses whose 
subject is non-coreferential with that of the main clause; and a third group of 
converbs, called variable-subject (VS) converbs, can occur both in subordinate 
clauses with a coreferential and in clauses with a non-coreferential subject 
(Nedjalkov 1995: 445). SS converbs do not take any person agreement markers, 
with the exception of the plural suffix -l (6a). The DS and VS converbs, on the 
other hand, obligatorily agree in person and number with the subject of the 
subordinate clause. This is accomplished by the use of possessive suffixes when 
the subordinate subject is non-coreferential with the main clause subject (6b), and 
by the use of reflexive possessive suffixes when they are coreferential (i.e. with 
VS converbs; 6c). The following examples illustrate the use of the SS temporal 
converb (6a) and the difference in person-marking between the coreferential and 
non-coreferential use of the VS simultaneous converb (6b, c): 


(6) a. Evenki (Nedjalkov 1995: ex. 7, 8a, 8b) 
dʒu-la-ver eme-mi-l dʒep-co-tin 
house-LOC-PREFL.PL come-TEMP.CVB-PL eat-PST-3PL 
‘Having come home they ate.’ 

b. Turu-du bi-ŋesi-n tara-ve sa-ča-v 
Tura-DAT be-SIM.CVB-POSS.3SG that-DEF.ACC know-PST-POSS.1SG 
‘I knew that when s/he was/lived in Tura.’ 

c. Turu-du bi-ŋesi-vi tara-ve sa-ča-v 
Tura-DAT be-SIM.CVB-PREFL that-DEF.ACC know-PST-POSS.1SG 
‘I knew that when I was/lived in Tura.’ 


In most Mongolic languages, not even finite verbs take subject agreement markers 
(Sanzeev 1964: 82, 83-4), let alone converbs. An exception, however, is Buryat, 
in which the converbal system functions in a manner very similar to that in Evenki. 
Thus, the converbs occurring only or predominantly in SS constructions do not 
take person marking (Skribnik 1988: 143; 2003: 117; 7a), with the exception of the 
modal converb, which can take reflexive possessive suffixes (Skribnik 2003: 116, 
table 5.8). The remaining converbs take possessive subject-agreement markers 
when they occur in subordinate clauses with a non-coreferential subject (7b), or 
reflexive possessive person markers when the subjects are coreferential (Poppe 
1960: 70; Skribnik 1988: 149; 7c). The conditional and abtemporal converbs, however, 
remain unmarked in SS constructions even though they can also occur in non- 
coreferential clauses, where they take possessive suffixes (Skribnik 1988: 152). 


(7) a. Buryat (Skribnik 2003: pp. 116-17) 
tedener-te 6xibii:d-irn’ tuhal-xaja: jere-xei 
those-DAT children-POSS.3PL help-FIN.CVB come-RES.PTCP 
‘Their children have come to them in order to help.’ 

b. tende xiire-že ošo-tor-nai dain balda:n 
there reach-IPF.CVB go-TERM.CVB-POSS.1PL enemy.OBL ? 
durha-xa johotoi 
end-FUTPT probably 
‘By the time we get there, the war will surely be over.’ 

c. Butedma: teren-tji tani-xalarr-ar bajarla-sa-ba 
B. that.OBL-ACC recognize-SUCC.CVB-PREFL be.glad-INTS-TERM 
‘Recognizing him, Butedmaa was glad.’ 

It is thus clear that the Buryat system is not quite as regular as that found in Evenki. 
In Evenki there is a strict correlation between syntactic function and the type of 
agreement marking, with SS converbs taking no agreement suffixes, DS converbs 
always taking possessive suffixes, and VS converbs taking either possessive suf- 
fixes in non-coreferential clauses, or reflexive possessive suffixes in coreferential 
clauses. In contrast, in Buryat the modal converb can take reflexive possessive 
suffixes, even though it occurs predominantly in clauses with coreferential subjects 
and can thus be counted among the SS converbs. In addition, the conditional and 
abtemporal converbs remain unmarked in coreferential clauses, even though 
they can be classified as VS converbs. However, notwithstanding the slight irregu- 
larities found in the Buryat converbal system, the similarity to Evenki is striking. 
The same types of subject agreement suffixes fulfil the same syntactic role in both 
languages. 
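
The strict Evenki correlation between converb type and agreement marking summarized above can be restated as a small decision procedure. The following is a minimal sketch under the assumption that the three-way classification (SS, DS, VS) and the marking described in the text are exhaustive; the names `ConverbType` and `evenki_converb_agreement` are hypothetical, and the plural suffix -l on SS converbs is deliberately left out.

```python
from enum import Enum


class ConverbType(Enum):
    SS = "same-subject"
    DS = "different-subject"
    VS = "variable-subject"


def evenki_converb_agreement(converb_type: ConverbType, coreferential: bool) -> str:
    """Return the kind of agreement suffix an Evenki converb takes.

    `coreferential` states whether the subordinate subject is coreferential
    with the main clause subject.
    """
    if converb_type is ConverbType.SS:
        if not coreferential:
            raise ValueError("SS converbs occur only with a coreferential subject")
        return "none"  # no person agreement (plural -l is ignored in this sketch)
    if converb_type is ConverbType.DS:
        if coreferential:
            raise ValueError("DS converbs occur only with a non-coreferential subject")
        return "possessive"
    # VS converbs: the suffix type signals whether the subjects are coreferential.
    return "reflexive possessive" if coreferential else "possessive"
```

The Buryat deviations would appear as extra branches in such a procedure: the modal converb can take reflexive possessive suffixes although it patterns with the SS converbs, and the conditional and abtemporal converbs remain unmarked in coreferential clauses.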

Buryat did not inherit this system from its Mongolic ancestor, indicating 
either that it was innovated independently or that it developed under contact 
influence. The arguments in favor of contact influence are quite solid: Firstly, 
the converbal system of Buryat and its function in syntactic reference tracking 
parallel the Evenki system and its functions. Secondly, the Evenki system was 
clearly inherited from its Tungusic ancestor, since syntactic reference tracking 
with the help of person-marked converbs is found in other Tungusic languages 
(albeit with different converbal suffixes). Lastly, speakers of Buryat have been and 
still are in close contact with speakers of Evenki. Thus, the conclusion that in this 
instance Evenki influenced Buryat is quite straightforward. This is most probably 
due to language shift of Evenks to Buryat, as documented by the presence of 
a number of Buryat clan names that are of Evenk origin, as well as by phono- 
logical changes in Buryat that can be traced to Evenki influence (Cydendambaev 
1981; Čimitdoržieva 2004). 


4.2 Evenki influence on the development of the Sakha 
and Dolgan partitive case 


Evenki is characterized by having two case suffixes to mark direct objects: the 
definite accusative suffix -vA/-mA is used in the majority of instances (8a), while 
the indefinite accusative case suffix -(y)A is only used to mark clearly indefinite 
direct objects (8b), objects that have not yet been made, or partially affected mass 
nouns (8c; Nedjalkov 1997: 147, 192-3). Furthermore, the indefinite accusative 
case is restricted to the future indicative and the imperative mood, and to use 
with habitual verbs, while the definite accusative case occurs with all past tenses 
(Nedjalkov 1997: 194). 


(8) a. Evenki (Nedjalkov 1997: ex. 782a, b, 786) 
oron-mo dʒava-kal 
reindeer-DEF.ACC take-PRXIMP.2SG 
‘Catch that (definite) reindeer.’ 
b. oron-o dʒava-kal 
reindeer-INDF.ACC take-PRXIMP.2SG 
‘Catch yourself a/any reindeer.’ 
c. min-du ulle-je kolobo-jo bu:-kel 
1SG-DAT meat-INDF.ACC bread-INDF.ACC give-PRXIMP.2SG 
‘Give me (some) meat and (some) bread.’ 


In the languages of Siberia, direct object marking varies widely from language to 
language. However, there are two Siberian languages that make a similar distinction 
in the case marking of direct objects to that found in Evenki: these are the closely 
related Turkic languages Sakha (Yakut) and Dolgan. Sakha is spoken by a group 
of cattle and horse pastoralists who immigrated to the Lena river from an area 
to the south roughly during the thirteenth/fourteenth centuries. Dolgan is spoken 
on the Taimyr Peninsula by a group of reindeer herders. The origins of this group 
are not yet well established, but a language shift to Sakha by Evenks is assumed 
to have been involved. 

In Sakha and Dolgan, in the indicative and conditional moods, definite and specific 
indefinite direct objects are marked by the accusative case, while generic 
indefinite direct objects remain in the unmarked nominative case. In the affirma- 
tive imperative mood, however, whereas definite direct objects take accusative 
case marking (9a), indefinite direct objects as well as partially affected mass nouns 
take the so-called partitive case (9b, c; Artem’ev 1999: 107; Pakendorf 2007: 142-6). 


(9) a. Sakha (Pakendorf 2002 field data) 
mieye bu yara att tut-an bier 
1SG.DAT this black horse-ACC hold-PF.CVB BEN[PRXIMP.2SG] 
‘Catch this black horse for me.’ 

b. (Pakendorf 2007: ex. 30b) 
mieye at-ta tut-an bier 
1SG.DAT horse-PART hold-PF.CVB BEN[PRXIMP.2SG] 
‘Catch me a horse.’ 

c. (Pakendorf 2007: ex. 29b) 
halamat-ta huorat-ta amsay-iy 
salamat-PART yoghurt-PART taste[PRXIMP]-2PL 
‘Try some salamat (Sakha dish), some yoghurt.’ 
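
The distribution of direct object marking in Sakha and Dolgan described above can be summarized as a lookup, sketched here purely for illustration. The function name `sakha_object_case` and the category labels are hypothetical. The sketch covers only the indicative, the conditional, and the affirmative imperative, and it assumes that in the imperative both specific and generic indefinites take the partitive; behavior outside these conditions is not described in the text and is therefore rejected.

```python
def sakha_object_case(mood: str, definiteness: str, partial_mass: bool = False) -> str:
    """Return the case of a direct object in Sakha/Dolgan, following the prose description.

    mood: "indicative", "conditional", or "imperative" (affirmative only)
    definiteness: "definite", "specific_indefinite", or "generic_indefinite"
    partial_mass: True for partially affected mass nouns
    """
    if definiteness not in ("definite", "specific_indefinite", "generic_indefinite"):
        raise ValueError(f"unknown definiteness: {definiteness!r}")
    if mood in ("indicative", "conditional"):
        # The text does not discuss partially affected mass nouns in these moods,
        # so this sketch falls back on definiteness alone.
        if definiteness == "generic_indefinite":
            return "nominative"  # unmarked
        return "accusative"
    if mood == "imperative":
        if partial_mass or definiteness != "definite":
            return "partitive"
        return "accusative"
    raise ValueError(f"mood not covered by this sketch: {mood!r}")
```

On this reading, (9a) corresponds to `sakha_object_case("imperative", "definite")` and (9b) to `sakha_object_case("imperative", "generic_indefinite")`.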


Neither Evenki nor Sakha inherited this indefinite accusative/partitive case 
from its respective ancestor, and therefore the direction of contact influence cannot 
at first glance be easily determined. However, since the Evenki indefinite 
accusative occurs in more environments than the Sakha partitive, and has further 
functions, it is more probable that Evenki influence led to the development of the 
indefinite accusative meaning in Sakha than the other way round (cf. Pakendorf 
2007: 167-73). 

Evenki influence on the development of the Dolgan partitive case is more 
easily established, since the Dolgan partitive has developed a further function in 
parallel with the Evenki indefinite accusative case. The latter is additionally used as a 
designative case, in which benefactive and direct object functions are collapsed. 
Thus, the indefinite accusative case in Evenki marks direct objects that are 
intended for somebody’s benefit, the beneficiary being marked by obligatory pos- 
sessive suffixes on the case-marked object (10a). This designative function has been 
copied by Dolgan speakers onto their partitive case (10b; Artem’ev 1999: 106). 


(10) a. Evenki (Nedjalkov 1997: ex. 562a) 
dʒav-ja-v o:-kal 
boat-INDF.ACC-POSS.1SG make-PRXIMP.2SG 
‘Make a boat for me.’ 
b. Dolgan (Ubrjatova 1985: 118) 
h-ani-ka:n minieke bolop-puna oyor 
EMPH-now-EMPH 1SG.DAT sword-PART.1SG make[PRXIMP.2SG] 
‘Make a sword for me right now!’ 


The development of the Sakha and Dolgan partitive case is not the only instance 
of Evenki influence on the structure of these Turkic languages. Similar influence 
can be shown for the loss of the genitive case, the retention of the distinction between 
the comitative and instrumental case, the functions of the possessive markers, as 
well as for the development of the future imperative, as described in section 4.3. 
Interestingly, language shift of entire groups of Evenks to Sakha appears improb- 
able in light of genetic evidence, although some intermarriage of Sakha with Evenk 
women cannot be excluded. On the other hand, Y-chromosomal analyses indi- 
cate that only a small group of Sakha paternal ancestors settled on the Lena river 
500-1,300 years ago (Pakendorf et al. 2006). It is thus possible that the small 
group of immigrating Sakha pastoralists were dependent on the indigenous 
Evenks, at least until they had adapted to the new environment. This might have 
led to a degree of bilingualism of Sakha-speakers in Evenki, which might explain 
the contact-induced changes in the absence of shift (Pakendorf 2007: 317-23). 


4.3 The distinction between present and future 
imperative 


Evenki makes a morphological distinction between a present imperative and a 
future imperative. The latter form expresses commands that may be fulfilled at 
a later point in time, e.g.: 


(11) a. Evenki (Nedjalkov 1997: 19) 
dʒu-la-vi himat eme-kel 
house-LOC-PREFL fast come-PRXIMP.2SG 
‘Come quickly to my place.’ 
b. dʒu-la-vi (gocin) eme-de:-vi 
house-LOC-PREFL (next.year) come-DSTIMP-PREFL.SG 
‘Come to my place (next year).’ 


Both the present and the future imperative are found for all person-number combinations 
in Evenki. The marker for the present imperative is restricted to this 
function, with portmanteau suffixes expressing both mood and person/number. 
The future imperative paradigm, on the other hand, is split, with the first and 
third persons taking different markers from the second person for both mood and 
agreement. In the second person, the future imperative marker is identical to the 
purposive converb suffix, and agreement is achieved by the reflexive possessive 
suffixes. 

A distinction between commands that are to be fulfilled immediately and com- 
mands that may be fulfilled at a later point in time is quite rare among the Siberian 
languages, as it is worldwide (cf. Pakendorf 2007: 226-32; Gusev 2005: 62). In addi- 
tion to Evenki, it is found in the closely related Northern Tungusic languages Even 
and Negidal, which also use the purposive converb suffix plus reflexive posses- 
sive suffixes for the second person future imperative. A future imperative is fur- 
thermore found in one branch of the Southern Tungusic languages (in Nanay, 
Orok, and Ulča), where it is restricted to the second person. However, the future 
imperative marker in these languages differs from that found in the Northern 
Tungusic languages, being dedicated to this function. Furthermore, a distinction 
between present and future imperative is made in Nganasan, in Dolgan and Sakha 
(12a, 12b), in Yukaghir (13a, 13b), and in the Mongolic languages Buryat and Dagur. 
All of these languages are currently or were historically in contact with Evenki 
or Even, and they are all the sole members of their respective language families 
to make such a distinction. 


(12) a. Sakha (Pakendorf 2002 field data) 
siidhii-giin kepsez 
livestock-ACC.2SG tell[PRXIMP.2SG] 
‘Tell about your livestock!’ 

b. (Pakendorf 2007: ex. 67c) 
bu tiiliippiién-tinen kepse-t-e:r die-n 
this telephone-INS tell-CAUS-DSTIMP[2SG] say-PF.CVB 
‘“Tell me (later) by telephone”, he said.’ 


(13) a. Kolyma Yukaghir (Maslova 2003b: ex. 338a) 
tet jaqte-k kejien 
2SG sing[PRXIMP]-2 at.the.beginning 
‘Sing first!’ 

b. (Maslova 2003b: ex. 339a) 
tet colhoro kudedée lek-telle jaqte-ge-k 
2SG hare liver eat-SS.PF.CVB sing-DSTIMP-2 
‘Eat some hare liver and then sing!’ 


The distribution of the present/future imperative distinction among the lan- 
guages of Siberia is strongly indicative of contact influence, with the Northern 
Tungusic languages as the source. There are three arguments in favor of this 
conclusion: First, none of the languages not belonging to the Tungusic language 
family could have inherited the distinction between a present and a future 
imperative from their linguistic ancestors. This implies either that all of these 
different languages innovated the future imperative independently of each other, 
or that all developed it under contact influence, a rather more plausible assumption. 
Second, a distinction between a present and future imperative is found in 
two different branches of the Tungusic family, indicating that it may well be an 
inherited feature in the Northern Tungusic languages Evenki and Even. Third, 
Evenki and Even are in contact with all of the non-Tungusic languages making 
a distinction between a present and future imperative. This indicates that the direc- 
tion of influence was probably from Evenki and/or Even to the other languages. 

However, none of the languages that have developed a future imperative copied 
the marker directly from Evenki. In Sakha (and Dolgan), the future imperative 
grammaticalized out of a previous analytical imperative construction (Pakendorf 
2007: 237-41), while in Kolyma Yukaghir, the present imperative is unmarked in 
the second person, and the future imperative (which is restricted to the second 
person) is marked by the same suffix -ge that expresses the present imperative in 
the first and third person (Maslova 2003b: 140). The Buryat future imperative has 
developed through an extension of meaning of an imperative form that in other 
Mongolic languages expresses a polite imperative (Poppe 1960: 60; Skribnik 2003: 113). 

The most direct evidence for Northern Tungusic contact comes from the 
Mongolic language Dagur, which has long been spoken in contact with Solon Evenki 
in Inner Mongolia. Dagur developed a so-called “indirect imperative” with a meaning 
of delayed action and politeness, e.g. yau-garm-miny [go-PURP-POSS.1SG] 
‘I will go later; let me go later!’. The suffix used for this future imperative is the 
purposive converb, and, as in purposive constructions, it can take reflexive pos- 
sessive suffixes as agreement markers for the second person (Tsumagari 2003: 143-4, 
146). The use of the purposive converb with the reflexive possessive suffix as a 
future imperative marker is clearly a copy of the future imperative construction 
found in Evenki, as described above, making the conclusion of its contact- 
induced origin quite straightforward (cf. Tsumagari 2003: 144). However, in con- 
trast to Evenki, in Dagur the future imperative uses the purposive converb plus 
possessive suffixes for all person-number combinations. 


5 Conclusions 


This brief sketch of language contact influences in the vast area of Siberia has illus- 
trated that contact situations can be multi-layered. Currently ongoing changes in 
the languages of Siberia are due to the influence of Russian and, in certain areas, 
of Sakha, both of which are politically dominant; unfortunately, this dominance 
is leading to a large-scale shift to Russian, and occasionally to Sakha. In addition 
to the influence exerted by politically dominant languages, over the centuries the 
indigenous languages have been undergoing changes brought about by contact 
with their neighbors. Unfortunately, not much is known about the prehistoric 
contact between the indigenous peoples of Siberia, making it difficult to draw con- 
clusions from these changes. In some cases, they are probably due to substrate 
influence resulting from language shift, as in the case of Evenki influence in Buryat. 
Whether in other cases contact influence may be due to long-term multilingualism 
is hard to establish for certain. However, in the example of Sakha-Evenki contact, 
previous bilingualism of Sakha speakers in Evenki is a possibility. More studies 
involving both fine-scaled molecular anthropological and linguistic analyses of 
contact in Siberia are therefore necessary to elucidate how these languages changed 
under different kinds of contact. Finally, it has become clear that the copies made 
by the recipient languages are not always identical to the model: the Buryat con- 
verbal system shows some deviations from the strictly functional person-marking 
found in Evenki, while Dagur extended the use of the purposive converb as a 
future imperative marker to all persons, whereas in Evenki this is restricted to 
the second person. This demonstrates that copied elements can undergo language- 
specific changes after their incorporation into the recipient language, resulting in 
a lack of identity between the model and the copy (cf. Johanson 1992: 175). 


NOTES 


I thank Rebecca Carl for drawing the map of Siberia, and Bernard Comrie, Katharina Gernet, 
Markus Lang, Dejan Matić, and Rolf Pakendorf for constructive criticism of a draft of this 
chapter. Obviously, any remaining flaws are entirely my responsibility. 

1 Given the diverse meanings the word “borrowing” has in the literature on language 
contact, I prefer to speak of “copying” (cf. Johanson 1992: 175). 

2 This development has also been attributed to Sakha influence (Malchukov 2006: 127). 

3 Abbreviations used in this chapter are as follows: 


ACC     accusative            M       masculine 
ALL     allative              NFUT    non-future 
ATTR    attributive           NOM     nominative 
BEN     benefactive           OBJ     object 
CAUS    causative             OBL     oblique 
COND    conditional           PART    partitive 
COP     copula                PF      perfective 
CVB     converb               PL      plural 
DAT     dative                POSS    possessive 
DEF     definite              PREFL   reflexive possessive 
DS      different subject     PREP    prepositional case 
DSTIMP  future imperative     PRS     present 
EMPH    emphatic              PRXIMP  present imperative 
F       feminine              PST     past 
FIN     final                 PTCP    participle 
FUT     future                REAL    realis 
FUTPT   future participle     RES     resultative 
IMP     imperative            SBJ     subject 
INDF    indefinite            SG      singular 
INF     infinitive            SIM     simultaneous 
INS     instrumental          SS      same subject 
INTR    intransitive          SUCC    successive 
INTS    intensive             TEMP    temporal 
IPF     imperfective          TERM    terminative 
LOC     locative 
4 Stern (2005a: 178), however, suggests that this pidgin may have arisen as late as the 
turn of the eighteenth and nineteenth centuries. 

5 Note that the term “creole” referred only to the peoples’ mixed ancestry; Copper Island 
Aleut is not a creole, but a mixed language. 


36 Language Contact in 
South Asia 


HAROLD F. SCHIFFMAN 


Language contact has been a topic of interest in South Asian linguistics almost 
since the beginning of European involvement in the study of the area. The earli- 
est researchers concluded that the strong influence of Sanskrit and other Indo- 
Aryan languages on the other languages of the area seemed to imply that all of 
the languages of “India” were related, and as the discipline of historical linguistics 
developed, that all of the languages were therefore descended from Sanskrit or 
from some sort of Proto-Indo-Aryan. Gradually, this misconception was done away 
with, but the existence of contact phenomena remained obvious, and continues 
to be an area of interest to this day, especially as the reverse influences, i.e. of 
“other” language groups and their influence on Sanskrit and other Indo-Aryan 
languages, became more obvious, and it also became clear that those “other” groups 
were not historically descended from Sanskrit/Indo-Aryan. The intensive give- 
and-take nature of the contact is often referred to as the notion of “India as a 
linguistic area” or, to use the German term, a Sprachbund. 

The main kinds of evidence for intensive borrowing and cross-family influ- 
ences are the following: 


1 The presence of retroflex consonants that contrast with alveolar or dental con- 
sonants, so that most languages have a system of stops with five points of 
articulation, along with voiced variants and homorganic nasals, and in some 
cases, aspirated as well as unaspirated stops. 

2 Use of lexical verbs as markers of aspect (sometimes known as “vector 
verbs”) which not only provide aspectual distinctions (such as “completive” 
versus “durative”) but also may be used to convey “attitudinal” or “speaker- 
centered” distinctions, such as whether the verbal action is done on one’s own 
behalf or in somebody else’s interest, whether the speaker approves of or is 
critical of the action, whether the action is intentional or accidental, etc. Over 
time, these aspectual verbs have become grammaticalized, i.e. recruited into 
the morphological system, resulting in meaning differences between the 
aspectual verb and its lexical source. In the process, the aspectual marker may 
become phonologically reduced. 

3 Stative verbs tend to have dative subjects, i.e. impersonal constructions such 
as “To-me knowledge exists” versus “I know (something),” or “To-me it is 
liked” versus “I like it.” 

4 Word order is typically SOV, meaning that the languages have postpositions 
instead of prepositions, while adjectives, genitives, and even relative clauses 
are embedded before nouns, since they cannot in most cases follow the 
sentence-final verb. 

5 Nouns have fairly elaborate case systems, but postpositions can proliferate to 
the extent that the distinction between the two may not be clearly discernible. 

6 Reduplication, usually with a fixed consonant + vowel in the reduplicated syl- 
lable replacing the first consonant + vowel in the basic word, e.g. Tamil puli 
kili for ‘tigers and other things’. This process may also appear in verbs, e.g. 
Tamil pooyttu kiittu ‘going and other activities’ (‘coming and going’). 

7 Pidginization and creolization can result, not only through contact of indigen- 
ous languages with colonial languages, but internally via contact between 
more prestigious languages and those of lesser, but indigenous status. In South 
Asia the importance of English has led to a whole subfield of the study of 
South Asian English; but Creole Portuguese continues to survive in isolated 
pockets, and other indigenous creoles such as Bazaar Hindi, Naga Pidgin, 
and Vedda Creole also provide challenges to the theory of pidginization. 
Suggestions that creolization has affected languages not usually thought to 
have a pidgin or creole origin have also provided challenges to standard the- 
ories of linguistic “Stammbaum” relatedness. 

8 Use of the verb ‘say’ as an embedding device, as a verbalizer of onomatopoeic 
expressions, and in other grammaticalization situations. Related to this is 
the widespread areal use of onomatopoetic devices, especially reduplicated, 
together with the verb ‘say’, or as a verb stem (Emeneau 1969). Since this 
pattern in Indo-Aryan is not inherited from Indo-European, it is hard to deny 
that its origin is from Dravidian or Munda; the proliferation of these devices 
in Dravidian, with phonological patterns that are often at variance with 
“normal” sound patterns, is a mystery that has no easy answer. 

9 The languages of the area are often referred to as “agglutinative” since 
grammatical morphemes tend to have clear-cut boundaries; suffixes prevail 
over prefixes. But there are regions of dominance within the area, e.g. the 
Dravidian languages eschew prefixes almost entirely, whereas prefixation in 
Sanskrit and languages “descended” from Sanskrit remains a possibility in 
the Indo-Aryan area; retroflexion is “stronger” in the Dravidian languages, 
but reduces gradually as one moves north, even within the Dravidian area, 
transitioning through Marathi, which, for example, retains a retroflex lateral, 
into areas of North India which do not. 


1 Review of Literature 


Although it is customary to devote some space in an article like this to a review 
of literature, in effect this whole chapter is a review of other scholars’ work on 
various topics that impinge on the subject of language contact. To go back into 
history a bit, various European scholars engaged primarily in historical studies 
of Indo-European noted the existence of some of the area-wide features in the 
early nineteenth century. Caldwell (1856), whose intent was to demonstrate that 
the Dravidian languages were not related to Indo-Aryan, but constituted a 
separate family, attributed the widespread existence of retroflex consonants to 
Dravidian origin; Bloch (1934 [1965]) provided the first inventory of widely 
shared features. But it is Emeneau (1956) who first posited the notion of “India 
as a linguistic area” on the lines of other linguistic areas such as the Caucasus, 
West Africa, the Pacific Northwest, the Balkans, and others. Andronov (1964) 
expanded the scope of the proposed inventory by suggesting a number of pos- 
sibilities that were perhaps not widespread in the area, but had regional or sub- 
regional frequency; he also proposed that the convergence that is observable in 
South Asia might go so far as to eradicate known genetic relationships, resulting 
eventually in new language families. Few other scholars have been willing to go 
so far as to accept this somewhat neo-Marrian proposal, but the study of sub- 
regional areal features is an interesting and useful activity that might reveal deeper 
generalities. 

The question of which family of South Asian languages is to be seen as the donor 
or originator of these features is one that a number of scholars have addressed, 
but the reluctance to attribute them to Dravidian among Indo-Aryanists is hard 
to see as anything but a kind of cultural superiority complex — the idea that 
Indo-Aryan languages, and especially Sanskrit, could borrow features from less 
prestigious languages is hard for some to swallow, even when the evidence is 
well-nigh inescapable. Bloch, for example, unable to stomach the idea of a 
Dravidian origin of certain features, is forced to contemplate Munda languages 
as a source (Bloch 1924: 20). Emeneau, on the other hand, sees the Dravidian 
family as the source for most of the areal features he has dealt with, so a kind of 
linguistic politics, which we usually witness fought out in the streets as well as 
in courts and legislative chambers, unfortunately also finds itself manifested in 
scholarship. 

Another interesting approach to the whole issue is that of Masica (1976), which 
attempts to map the borders or widest extension of certain features, in order to 
see whether they extend beyond South Asia, or are confined to the general area, 
or even to a smaller portion of the general area. Masica concentrates on certain 
syntactic features of South Asian languages, such as the relative positioning of 
subjects, objects, and verbs in surface structures. He wishes to see if their distri- 
bution has borders similar to the isoglosses established in dialectology studies, 
and whether these isoglosses “bundle” at the borders of the area, or diffuse gradu- 
ally. An earlier study along similar lines is Ramanujan and Masica (1969), which 
looked at the distribution of phonological features and their possible isoglossic 
boundaries, and found that retroflexion, so typical of the area, was more strongly 
distributed in the Dravidian area, tapering off as it went north, so that Marathi, 
for example, possesses a retroflex lateral phoneme like the Dravidian languages 
do, but further north, this sound disappears, even if Devanagari and other scripts 
possess a glyph for it. 


2 Creoles and Pidgins 


As noted above, the issue of pidginization and creolization can be divided on the 
one hand into varieties that arise through contact with languages coming from 
outside the area, especially through colonialism, and varieties that arise through 
internal contact among languages already indigenous to the area. For the former, 
it is English and its influence on local South Asian languages that gets the lion’s 
share of the attention, chiefly through the work of Kachru (1965; 1966; 1969). But 
of course English was not the first European language to have an impact on South 
Asian languages: Portuguese preceded it, and in this early contact the Portuguese 
made use of the pidgin (Sabir) that was already in existence in the Mediterranean 
and had been used in early contact with Africa. It was subsequently employed as 
a lingua franca by successive colonizers, so that it has retained salience beyond 
its early beginnings. 
Because pidgin languages were considered “corrupt, broken” and beneath con- 
tempt, attention was not paid to them until fairly late, so the earliest work on 
Indo-Portuguese was not until Dalgado early in the twentieth century (1900; 1906; 
1917) and there was then little work until that of da Fonseca in the middle part 
of the century (1959). But other, indigenous pidgins and/or creoles have received 
less attention, despite obvious phenomena such as the role of Persian in the devel- 
opment of Urdu, the possibility that Marathi might owe some of its structure to 
Dravidian influence (Southworth 1971; 2005), and the less secret cases of Vedda 
Creole and Naga pidgin (and/or creole) developments. 

One of the crucial issues regarding the development of creoles and pidgins in 
South Asia is the extent to which universals of language come into play, and whether 
there are “universal” developments that happen, no matter what the donor lan- 
guage(s) and substratum language(s) are. This issue is known within the study 
of creoles and pidgins as the “monogenetic” versus the “polygenetic” hypothesis. 
The monogeneticists, as might be expected, believe that all creoles and pidgins 
in the world can trace their origins back to one pidgin, one that developed in the 
Mediterranean during the Middle Ages, and which eventually came to be known 
as Sabir. This pidgin then became the vehicle used by the Portuguese in their first 
explorations, and in the process gradually became more strongly influenced by 
Portuguese as a result; eventually this pidgin Portuguese reached South Asia and, 
as we have noted, was used by all colonial powers in their early dealings with 
South Asia, such that Portuguese words penetrated all the languages it came into 
contact with. As a result, lexical items for realia that South Asians did not pos- 
sess, such as “window” or “table” can often be traced back to the Portuguese item. 
Thus the Tamil for ‘window’ is jannal, from Pt. janela, and ‘table’ is meese, from 
Pt. mesa. Later borrowings, of course, would be from English, such as karant (from 
English ‘current’) for ‘electricity’. 

Things get problematical for the monogenetic hypothesis, however, when the 
two languages in contact, especially the powerful one that donates most of the 
lexical items, do not involve Portuguese or another European language. We do 
have examples of such in South Asia, such as the Naga Pidgin spoken in 
Nagaland, which borrows lexica mainly from Assamese, Vedda Creole in Sri Lanka, 
which borrows mainly from Sinhala, and Sri Lanka Malay (Smith & Paauw 2006; 
Ansaldo 2008) and perhaps others. In general in pidgin situations, the lexical items 
are contributed by the politically more powerful language, e.g. the colonial 
power, whereas the underlying structure comes from the colonized; but some- 
times so-called “universal” features can indeed be distinguished, although how 
to sort this out seems to sometimes depend on the politics of one’s linguistic 
theory. 

Another issue for South Asia is whether pidginization and/or creolization has 
gone on at a much earlier stage, but is now not easy to determine, because of 
what is known as “relexification,” a process that often happens in pidginization 
and/or creolization situations, such that previously borrowed vocabulary is 
replaced by lexica from other languages. This vocabulary then masks the deeper 
history of the borrowing process. Thus a language that might have once borrowed 
massively from a donor language, but then “cleans up its act” as it were, by 
relexifying from a more “respectable” source, such as from Sanskrit, might be 
in fact a creolized language, but this would be difficult to detect. A hybrid term 
“creoloid” is sometimes used for such situations, meaning that the language in 
question is similar to a creole, but one cannot be completely sure. In many such 
cases, the guardians of the language in question do not wish to admit that 
creolization had anything to do with the history of their language, so the “ques- 
tionable past” of the language is covered over, and the issue is not discussed. 
The situation just described has been posited by Southworth (1971; 2005) for 
Marathi, but the notion that Marathi could have its origins in a creole heavily 
influenced by Dravidian has not found much credence among the speakers of 
that language. It is easy to replace lexical items 
from another source, but less easy to replace grammatical morphemes that may 
have been borrowed, of course. The fact that many Indo-Aryan languages now 
possess postpositions, which earlier Indo-Aryan did not, and that the form of the 
Hindi dative/accusative postposition is -ko, which strongly resembles the dative 
-kku in Tamil, is typical of the kind of data that lead some scholars to believe that 
Indo-Aryan acquired its postpositions from Dravidian. 


2.1 Indian English 


Though the influence of indigenous languages on the kind of English spoken and 
written in South Asia has long been noted, and differs little from the influence 
of indigenous languages on other imported languages, the importance of English 
in the subcontinent has meant that the study of “Indian English” has become a 
separate subfield in the scheme of things. While earlier work focused on perceived 
negative aspects of this influence, i.e. how English was “incompletely learned,” 
later studies have focused on the fact that English is in fact the mother tongue 
of many South Asians, and needs to be viewed as a variety of English in its 
own right, not a corrupt version of the “real” language as spoken by British, 
Australian, or North American “native” speakers. The work of Kachru (1965; 1966; 
1969) is foundational in this area. For more recent approaches see Tickoo (1996) 
and Ramanathan (2005); in my own research on the role of English (Schiffman 
2005a) I find that it is not viewed as a “foe” by non-Hindi-speaking groups, since 
it is perceived to act as a protective shield against the invasion of Hindi into inner 
domains of the languages, especially Tamil. 


2.2 More recent research 


Research interest in the topics mentioned heretofore has in more recent years 
focused on a smaller set of these topics, in particular on the creole and pidgin 
hypothesis, and on the effects of grammatical influence of some languages on 
others. The latter focus now views grammatical borrowing as part of the larger 
topic of grammaticalization; for more on this see below. Another way the topic 
has become “expanded” is to include discourse as a kind of areal feature, as in 
the work of Moag and Poletto (1991). 


2.3 Naga Pidgin (Nagamese) 


As an example of more recent research on pidgins and creoles, there is 
Bhattacharya’s (1994) study of Naga Pidgin, which is also known as Nagamese, 
and his conclusion that this pidgin has now undergone creolization, at least to 
some extent, perhaps more noticeably in urban areas. Bhattacharya reviews vari- 
ous claims that were made by earlier research, such as Sreedhar (1974), especially 
the claims that Nagamese might be characterized as a post-creole continuum, or 
a creoloid, meaning a language that superficially resembles a creole, but does not 
seem to have undergone creolization (or pidginization), and/or decreolization. 
He also looks at the claim that it is imperfectly learned Assamese or Bengali, but 
concludes that it is an expanded pidgin which is starting to creolize in some parts 
of Nagaland, and is already a creole (but not a creoloid) in Dimapur. The paper 
examines all these hypotheses in detail and gives reasons for the conclusions offered, 
the most important of which is undoubtedly the claim that Nagamese is now 
the mother tongue of at least some native speakers, even if it is not for many 
other users. 


2.4 Other creoles 


Ian Smith has specialized in South Asian creoles, with his 
work on Sri Lanka Portuguese and Sri Lanka Malay. In an overview article on 
the pidgins and creoles of South Asia, Smith remarks 


South Asian pidgins and creoles do not feature prominently in the general literature 
on pidgins and creoles, yet they have some important contributions to make to our 
understanding of these phenomena. First, South Asian contact languages in general 
demonstrate the importance of substrate languages as opposed to rather specific 
“universals,” largely because they have formed through contact between a lexifier 
and either a single substrate language or a number of structurally similar substrate 
languages. For example, Mühlhäusler enumerates nine “syntactic properties that have 
figured prominently in recent discussion of pidgin universals” (1986: 154ff.), but 
[...] neither Bazaar Hindi nor Nagamese conforms to the expected profile. Both 
languages are extended pidgins, however, and it may be argued that features such 
as (7) as well as possibly (4) in Nagamese were earlier present but have been 
obliterated by the expansion process, just as (5) is in the process of being lost in 
Nagamese. (Smith 2008: 14) 


Smith also investigates whether and how the linguistic outcomes of pidginiza- 
tion and (especially) creolization differ from the products of the convergence that 
takes place in situations of longstanding language contact. In Smith (2001) and 
unpublished conference papers, he compares two languages influenced by Tamil 
(among other languages), Sri Lanka Portuguese and Sourashtra, an Indo-Aryan 
language akin to Gujarati whose speakers left North India over a millennium ago. 
Both languages have undergone heavy Dravidianization, and Smith argues that 


Data such as these appear to blur the distinction between convergence and creolization: 
both languages exhibit similar types of structural “change” and it is difficult to attribute 
any particular structural result of contact to the social circumstances. There are 
certainly more differences in between Sou[rashtra] and Ta[mil] than between Ta. 
and SLP [Sri Lanka Portuguese], but this would suggest that the results of the 
two situations differ only in quantity rather than in quality. (Smith 1994, quoted in 
Smith 2001) 


This finding is in keeping with some recent work in creolistics, which argues 
that creolization is not greatly different from normal language change (e.g. 
Mufwene 2001). 

But Ansaldo (2008) takes issue with Smith, Paauw, and Hussainmiya (2004) on 
the issue of whether Sri Lanka Malay exhibits more influence from Tamil than 
from Sinhala. 


2.5 Kupwarization 


Another interesting case of language contact, probably unique to South 
Asia, is described in a study of a situation where three languages (Urdu, Marathi, 
and Kannada) have been in such close contact in one village that they have converged 
in many ways, sharing many grammatical and syntactic features but 
still remaining lexically distinct. The reason for maintaining distinct “languages” 
in this village is attributed to religious differences in the three communities — the 
Marathi speakers belong to one kind of Hindu sect, the Kannada speakers to another, 
while the Urdu speakers are of course Muslim. This study, by Gumperz and Wilson 
(1971), focusing on the pseudonymous village of “Kupwar,” has become so 
renowned that it has spawned the term “kupwarization” to refer to this kind of 
language contact phenomenon, whose features have been detected in other con- 
tact situations elsewhere in the world. The question of creolization, kupwarization, 
and other ways in which languages seem to influence each other raises challenges 
to the general theory of pidginization and creolization, since we seem to find cre- 
ole languages, or languages that exhibit the features we attribute to creolization, 
that have not passed through the supposedly requisite stages of pidginization, 
but still exhibit features that cannot be explained by other theories. South Asian 
examples of these also defy some other orthodoxies, since they do not necessar- 
ily exhibit the morphological simplicity found in other creole situations, do not 
necessarily result in SVO word-order, nor do they have the lexical basis of “uni- 
versal” (i.e. Mediterranean) creoles such as Sabir. 


2.6 Portuguese Creole in Daman and Diu 


Though Portuguese creoles appear to have declined in speakership in recent years 
as the younger generations of speakers abandon them in favor of other dominant 
local languages, there are a few areas where they persist, such as in the enclaves 
of Daman and Diu, where recent work has been carried out by Cardoso (2005). 


3 Grammaticalization 


Though earlier studies of language contact tended to focus mainly on lexical and 
phonological influences, and therefore to doubt the possibility that one language 
could affect another language’s grammatical structure, renewed interest in the 
topic of grammaticalization has independently brought a revived focus on how 
languages change grammatically, and how this might be an areal feature. One 
should note that though American linguistics in recent years has focused on syn- 
tax, European scholars such as Meillet and younger generations of scholars never 
lost interest in morphological change, and “kept the topic alive” until it was finally 
“rediscovered” (so to speak) in recent years. 


Whereas analogy may renew forms in detail, usually leaving the overall plan of the 
system untouched, the “grammaticalization” of certain words creates new forms and 
introduces categories which had no linguistic expression. It changes the system as 
a whole. (Meillet 1964 [1922]: 133). 


Given the similarity of certain kinds of changes in language families with inde- 
pendent histories, such as the Dravidian and Indo-Aryan groups, one must 
conclude that there is something going on, though it is sometimes difficult to deter- 
mine which language might be the donor, and which the recipient. But as Meillet 
points out, whatever the result, we are dealing with radical changes, not some 
superficial syntactic or stylistic variants. This is particularly evident when com- 
parisons are made between older versions of diglossic languages, such as “stan- 
dard” Tamil, and newer colloquial variants. When we examine the differences 
between these closely, we find that Spoken Tamil has changed radically, and in 
particular through the grammaticalization of what were previously lexical verbs, 
but which must now be considered morphological markers, i.e. grammatical 
morphemes. This is particularly obvious when it comes to aspectual and modal 
distinctions, and the widespread similarities found between the Tamil or other 
Dravidian systems and Indo-Aryan systems are striking indeed. 

Though some scholars continue to resist this notion of “radical change” in the 
structure of these verbal systems, recent work by European scholars such as Heine 
and Kuteva (2003), Maisak (1999), and others has come out more emphatically in 
favor of a clear connection between language contact and grammaticalization, 
citing Dahl (2000), who points out that 


grammaticalization processes tend to cluster not only genetically but also areally, 
and [that] the terms areal grammaticalization (Kuteva 2000) and grammaticalization 
area... have been proposed to describe the effects of grammaticalization processes 
on the areal patterning of linguistic structures. (Dahl 2000: 317) 


How the two are interrelated is not clear, and it will not be the goal of this paper 
to try to decide whether there is cause-and-effect; but the data from South Asia 
attest strongly to areal patterning of various structures, i.e. it is not just borrow- 
ing of lexical items or phonological material, but grammatical patterns, syntactic 
patterns, and combinations of these (Heine & Kuteva 2003: 530). 


3.1 Grammaticalization in Dravidian 


In my own work on Tamil I was driven to this conclusion by a roundabout route 
that focused originally on some irregular phonological changes, an analysis of which 
led me to the realization that what had occurred was in fact morphologization, 
and that what had started as independent (lexical) verbs ended as grammatical 
markers of aspect (Schiffman 1993). The phonological evidence showed that 
certain intervocalic consonants were only deleted word-internally, but not across 
word boundaries; this meant that items that had originally been treated 
as separate lexical verbs had to be analyzed as having become word-internal 
morphemes, because their initial consonants were undergoing a deletion that in 
Spoken Tamil applies only word-internally. This forced me to conclude 
that grammaticalization had taken place, converting the lexical items to aspectual 
suffixes of the verb. 

Another issue that is peculiar to South Asian languages is that there is often 
an “attitudinal” nuance contributed by certain verbs; this is consonant with what 
grammaticalization theory foregrounds about the role of the speaker in the 
development of new grammatical devices — the need to recruit a lexical item to 
express a “speaker-centered” issue overrides the availability of older grammatical 
devices. That is, speakers may become dissatisfied with the grammatical devices 
available to them, so they reach for new ways to state the point of view (and the 
word “aspect” of course, with its origin in Lat. ad-spectare, does focus on the point 
of view) they wish to express. Thus whether the language is Indo-Aryan or 
Dravidian, we get certain verbs being used, not only to express “completion” or 
“perfectiveness,” but also to express whether an action was intentional or accidental. 
This requires a judgment on the part of the speaker: again, speaker-centered 
priorities are foregrounded. 


3.2 Definitional problems 


In the literature on South Asian languages, this issue is plagued by definitional 
quarrels, i.e. is the phenomenon that all the languages exhibit to be referred to 
as “compound verbs,” “vector verbs,” “Aktionsart,” “aspectual verbs,” “aspect 
markers,” “verbal extension,” “multiple periphrastic perfectives” or something else 
entirely? These definitional quandaries have their origins in the fact that in some 
of the languages, in particular the Indo-Aryan languages, the evolution of the verbs 
having these “special” uses has not proceeded so completely to grammaticaliza- 
tion, as Hook (1991) shows for Marathi as compared with Hindi. But for the 
Dravidian languages, at least those that I have examined, grammaticalization of 
some but not all of these verbs seems to have in fact been completed, such that 
some of them have complete freedom to co-occur with any verb whatsoever in 
the language, some of them have almost totally lost their lexical analogs, phono- 
logical reduction is radical, and the aspectual nuances are very obvious. One exam- 
ple of this “radical” phonological reduction would be the Literary Tamil ‘verb’ 
koḷ, whose adverbial participle form is kontu, which is reduced in extremis in Spoken 
Tamil to kittu or just -ttu, as in (2) below. 

Thus the Tamil aspectual verb koḷ, which had the original lexical meaning 
‘contain’, now has almost no lexical residue, but instead has multiple aspectual 
meanings. When contrasted with the use of iru ‘be (located)’ it indicates that an 
action is completed to the benefit of someone, whereas iru indicates continuity, 
but no completion. The illocutionary force of the first example, therefore, is 
sarcastic: 


(1) a. tamiR enge  katt-irukkiinga 
       Tamil where study-duration=TNS=PNG 
    b. tamiR enge  kattu-kittiinga 
       Tamil where study=COMPL/BENEF=TNS=PNG 


Both of these sentences literally mean ‘Where did you learn Tamil?’, but (1a) 
has sarcastic illocutionary force: ‘Where (the hell) did you learn Tamil?’ (i.e., ‘you 
don’t know Tamil’), whereas example (1b) uses koḷ as its aspect marker and 
therefore has an obligatory implicature of actual completed acquisition of the 
language, and so lacks the implicature of sarcasm. 

Another interesting example of the multiplicity of meanings contributed by the 
aspectual markers (i.e. completive plus benefactive plus something else) is given 
by Annamalai (1985) in which exactly the same structure is used to mark the notion 
that an action affected the actor in some (usually beneficial) way, but with an addi- 
tional nuance that must be interpreted in different and contradictory ways. In these 
Tamil examples, the aspectual auxiliary koḷ expresses in one of the sentences the 
notion that an action was intentional, and in the other, that it was accidental: 


(2) a. raamasaami mudi-ye  vetti-kittaan 
       Ramaswamy  hair-ACC cut-BENEF-PAST-PNG 
       ‘Ramasamy cut his hair (on purpose).’ 

    b. raamasaami kayy-e   vetti-kittaan 
       Ramaswamy  hand-ACC cut-BENEF-PAST-PNG 
       ‘Ramasamy cut his hand (by accident).’ 


In example (2a), the action of cutting is considered to be intentional, because that 
is what getting one’s hair cut is usually considered to be — one does not accidentally 
sit down in the barber chair. But in example (2b), the cutting of the hand is taken 
to be accidental because sane people do not usually deliberately cut themselves 
in the hand. These contrasting pairs therefore illustrate another facet of the com- 
plexity of these processes, namely, that the meaning of the aspect marker (or what- 
ever we choose to call it) may vary depending on the main verb it collocates with. 

A basic assumption of grammaticalization theory is that when a language recruits 
lexical items to be used to express grammatical categories, it is almost always a 
process that takes centuries to be completed, and some parts of the process (i.e. 
some of the items being grammaticalized) will be more complete than others. The 
theory refers to this as the cline of grammaticalization, meaning that the grammat- 
icalization process is slow and upward-moving, but in the end the process is usu- 
ally completed (see Figure 36.1). Some items, e.g. different aspectual verbs being 
grammaticalized, will be at different points on the cline from others, i.e., one might 
be at point B, just beginning the process, while another might be at point G, the 
end result. And of course the possibility exists that the process might never be 
completed, such that the items recruited remain in a sort of limbo between their 
“purely” lexical meaning, and their “special” use as a grammatical nuance. In Indo- 
Aryan, the debate seems to come down on the side of incomplete grammatical- 
ization, so that the verbs in question, used in many if not all Indo-Aryan 
languages, are referred to as “vector” verbs, but not as “aspectual” markers. In 
Dravidian, the evidence of complete grammaticalization, at least for some of the 
verbs in question, is more convincing, especially when the lexical item that is the 
source of the grammaticalized verb may lose its lexical meaning, or be lexically
marginalized. As noted above, this is the case for Tamil kol- and its Kannada
analog, kollu, which not only exhibit the most complex aspectual meanings of
any of the aspectual auxiliaries, but have also almost ceased to have any lexical
usage whatsoever.

Figure 36.1 The cline of grammaticalization


3.3 Variability 


This results in a kind of variability that is not sociolinguistic, but grammatical, 
and it also allows for polysemy, i.e. multiple meanings of the verb in question so 
that both its lexical meaning and aspectual meaning can be present, such as in 
Tamil vittudu meaning ‘definitely leave’ with concomitant phonological reduction 
in the grammaticalized item, but not in the lexical item it is derived from. In “standard”
transformational/generative grammar, of course, what are most important
are rules that apply categorically, so something that is variable is anathema to 
the theory, and is dismissed out of hand. This is a basic difference, then, between 
generative attempts to deal with these kinds of developments and more functional 
approaches. The generative school will have nothing to do with such messiness, 
since the degree of grammaticalization of the verbs in question will vary from 
item to item, so categorical attempts to classify something as clearly and 
definitively “aspectual” will fail. In a number of the studies of this phenomenon 
in the Indo-Aryan languages we see researchers attempting to deal with a fixed 
definition of aspect, whereas aspectual systems may vary from language to language
and language-family to language-family.³ As Maisak points out (Maisak
1999), attempting to bind these aspectual meanings to Comrie’s (1976) definition 
of perfective aspect as denoting “a situation viewed in its entirety, without 
regard to internal temporal constituency” (Comrie 1976: 12) is too restrictive; in 
the Dravidian case, “perfective” aspect, expressed with (v)idu might perhaps be 
better defined as “completive” or “definitive” since it can be used imperatively 
to mean ‘be sure to X’, as in: 


(3) naalekkilerundu veelekki vandidunga 
tomorrow-from work-to come-DEF 
‘Come to work starting tomorrow for sure’ 


Thus as Maisak recommends, we should refer to what Indo-Aryanists seem to 
prefer to call “perfective” as “a whole family of meanings comprising perfective 
and a number of other values usually interrelated in languages and linked by paths 
of diachronic development” (Maisak 1999: 1). 


3.4 Metaphor versus metonymy 


Grammaticalization can be seen as a metaphoric or metonymic process, with some 
scholars preferring one term over the other; however, both processes are similar 
insofar as they both often involve the use of a verb of motion, e.g. to express “futur- 
ity” or another verb to express completion, etc. Further, this also helps explain 
why Indian languages have verbs⁴ that are expressive of various kinds of “attitude”
about the situation, the addressee, or the action, e.g. verbs that lexically
mean ‘throw’ or ‘cast down’ or ‘drop down’ do have some aspectual meaning, 
but beyond that are expressive about the speech situation. These are usually 
believed to be less completely grammaticalized than the others, and may always 
remain so. But interestingly, they are also verbs of motion, such as common motion 
verbs like ‘go’ or ‘come’⁵ but also ‘put’, ‘throw’, ‘let go’, ‘fall’, ‘push/shove’, ‘drop’,
etc. That is, there is an element of motion involved, even if motion is not the “main” 
meaning. And like motion verbs in other languages that are used for these func- 
tions, they have gone further in the process of grammaticalization than non-motion 
verbs, undergoing “weakening” of their lexical meaning.⁶ The focus on “attitude”
also comes up in work on discourse as an areal feature (Moag & Poletto 1991). 


3.5 Other kinds of grammaticalization 


Some of these innovations, such as the verbs meaning ‘say’ (Sanskrit iti, Tamil 
en), have taken on the function of marking embeddings or quotations, and are 
now well accepted, although the term “grammaticalization” might not have been 
used. Saxena (1995), for example, builds on this evidence to buttress the theoret- 
ical claim that grammaticalization is unidirectional. In Tamil, we can see that the 
grammaticalization of en ‘say’ has gone beyond this quotative function to take on 
a variety of functions, such as “factive,” the expression of intention, the “obstinate
negative,” and use with a myriad of onomatopoeic verbs and expressions as well.
Indeed the Tamil verb en has no pure lexical functions left in the spoken language,
but is only used as a marker of these functions (Schiffman 1999: 148-53, 161,
177-82). As I have tried to show, these unusual verbs of motion that are used to
express attitudes on the part of the speaker can be seen as based in metaphor, so 
that a verb like poodu, which has the lexical meaning ‘drop, plunk/plop (down)’ 
is used as an aspect marker to convey not only aspectual completeness or 
definiteness, but with a secondary “attitude” nuance of carelessness, or deliber- 
ate malicious intent (Schiffman 2005b): 


(4) neettu    varakkuudaadu-nnu  sonneenee.  aanaa, neettu    paattu
    yesterday come-NEG-NECESS-QT said-EMPH.  but,   yesterday deliberately
    vandu-poottaanga.
    came-MALICE-TENSE-PNG
    ‘I told them not to come yesterday, but they deliberately came anyway
    [the jerks!]’


As this example shows, poodu is used to express the speaker’s view that there has 
been a careless or even willful or malicious disregard for his or her desires or 
orders. An interesting sidelight is that when translated into English, we have to 
use an expletive or an adjective to describe the “attitude” being expressed, since 
a translation of the original Tamil lexical meaning ‘put, drop, plunk down’ does 
not work, i.e. does not convey the “attitude” in English. 


3.6 Other issues


Missionary grammarians’ description of these kinds of verbs, at least in the case 
of Tamil, referred to them as “intensive” (Arden 1942: 282-3), and as Maisak points 
out, this term is still being used to describe at least some of the “multiple 
periphrastic perfectives” (or MPPs) in some languages. Other researchers (such 
as Bybee & Dahl 1989) have used the term “bounders” to refer to these verbs, 
since they often are used to “bind” or limit the action in time or space somehow, 
such as to indicate a turning point or moment of change in the action. Often (as 
in the case of Tamil, for example) a verb meaning ‘go’ is used to indicate a change 
of state, just as in English ‘go’ can occur with other lexical items (e.g. ‘go 
bananas’, ‘go crazy’, ‘go postal’, ‘go belly-up’) to indicate a change of state, espe- 
cially a mental state. Some researchers have claimed that the use of ‘go’ (in Tamil, 
at least) is always negative or undesirable, and indeed there is a preponderance 
of such meanings in the Tamil case. As Maisak points out (Maisak 1999: 15), attempts
by various researchers to delimit or pin down the meanings provided by these
verbs are often thwarted by individual idiosyncrasies in particular languages, but
some generalizations can be made.


3.7 Transitivity 


One such generalization is that aspectual verbs that are transitive in their ori- 
ginal lexical meaning can usually only be paired with transitive main verbs, but 
again, this generalization may be countered in individual cases, such as in the 
vandu-poottaanga example given above, since the main verb “vaa” is indeed 
intransitive, but the aspectual verb poodu is transitive in its original lexical form. 
In Tamil in general, it seems that the more completely grammaticalized an aspec- 
tual verb is, the wider its possible distribution, so that (v)idu, the commonest com- 
pletive verb, may occur with any verb in the language, while the less-completely 
grammaticalized aspect marker vai, which in its lexical form means ‘put, place’ 
and in its aspectual usage gives a notion of doing something ‘for future utility’,
usually only co-occurs with transitive main verbs, but with some exceptions. 
Annamalai (1985) gives at least one example with an intransitive verb: 


(5) Dairektar oru jook sonnaar; naan siriccu vecceen 
Director a joke said; I laugh FUTUTIL-PAST-1SG 
‘The Director told a joke, and I laughed [dutifully, just in case.]’⁷


This example raises the question of whether there are “degrees” of transitivity, 
or whether there are other semantic issues involved here. The English verb 
“laugh” is clearly not a transitive verb, but deliberately laughing at a joke, e.g. 
to curry favor with an employer, seems different from doing so spontaneously, 
with no ulterior motive. That is, the intent of the laughter is to affect the outcome 
of something, which is somehow closer to being transitive than spontaneous 
laughter. 

Details such as these tend to make it difficult to generalize about these verbs,
but one desideratum of the issue of language contact in South Asia is surely 
more intense scrutiny of these phenomena. Most of the literature on the subject 
of “vector verbs” seems to assume that what is the case for Indo-Aryan must also 
be true for Dravidian, when in fact fine differences may thwart such attempts to 
generalize about the data. As Maisak puts it, 


It is thus a problem of semantic description of periphrastic forms; it is not generally 
accepted that their perfectivizing function is basic, other additional meanings being 
just “nuances.” However, we would like to note again that there is nothing unusual 
in the situation when a marker which is not fully grammaticalized has some specific 
use connected with its original semantics. As for completive markers, it has been 
shown in cross-linguistic studies that additional semantic nuances are particularly 
characteristic of them... (Maisak 1999: 16) 


Here I would disagree with Maisak about the acceptance of the notion that 
perfectivizing is basic: in the Dravidian⁸ situation, it seems to me that most of
the “attitudinal” aspect markers do in fact have a basic component of “com- 
pletive” or “definiteness” (“perfectivization”). But I agree that these less-fully- 
grammaticalized verbs do retain some of their original lexical meaning, and this 
meaning is metaphorical when used aspectually. Again, in the example above, the 
“carelessness” associated with the lexical meaning of poodu ‘drop, plop, plunk’ 
becomes one of “inconsiderateness” or “malicious intent” (Schiffman 1999: 101) 
when used as an aspect marker.⁹

I would also disagree with Maisak’s conclusion that these derivational perfec- 
tives are somehow “marginal,” i.e. not true grammatical markers: 


[A]lthough they [MPPs] play an important role in expressing perfectivity, they can 
by no means be treated as “genuine” grammatical markers. Like “derivational per- 
fectivization” by means of prefixes or suffixes, bounder perfectivization by means 
of auxiliaries does not create proper inflectional categories, but rather “grammatical- 
ized lexical categories,” as Dahl (1985: 89) put it. In this respect, MPPs stay, so to 
say, somewhere between lexical verbal compounds and “true” (grammaticalized) 
periphrastic forms. (Maisak 1999: 20). 


As mentioned above, my own work on certain phonological processes in Tamil 
(Schiffman 1993) that seemed to operate only word-internally led me to conclude 
that the earlier situation in Tamil where these verbs could still be considered “spe- 
cial” lexical verbs was no longer valid, and that these verbs had become gram- 
maticalized (or were “on the road” to grammaticalization). The fact that Literary 
Tamil writes them as separate verbs leads some researchers to take that as the 
status quo, and to ignore the phonological reduction these forms have undergone 
in Spoken Tamil, which researchers on grammaticalization note is usually a clear 
indicator that grammaticalization is in progress. Perhaps Maisak is basing his claim 
on the wide inventory of languages he investigates in his paper, but as far as 
I can see, the Dravidian languages I am familiar with have gone further in this 
process than Indo-Aryan, at least in the spoken variants of them. Here again, diglossia
may hide certain phenomena, and mask the true nature of the situation.


3.8 Possible further topics 


One issue that needs to be dealt with in an article on South Asian areal linguistics
is how Sinhala, a language separated widely from other Indo-Aryan languages, 
has been strongly influenced by Tamil. This topic has been extensively dealt with 
by Gair (Gair 1985; 1998 [1976]), who has pointed out such things as the lack of 
aspirated consonants and the existence of long as well as short mid-vowels in 
colloquial Sinhala, as well as some syntactic features that strongly suggest a 
Dravidian origin, or at least influence. Given the other interesting Sprachbund 
topics that Sri Lanka displays (Vedda Creole, Sri Lanka Malay, Indo-Portuguese) 
it seems that Sri Lanka is in fact a microcosm of the whole South Asian linguistic 
area, calling for more attention than it is possible to devote to it here. 


NOTES 


1 This review relies heavily on material presented in two chapters in Shapiro and 
Schiffman 1981, i.e. ch. 4, “South Asia as a Linguistic Area,” and ch. 7, “Pidginization,
Creolization, and South Asian English.” 

2 Recall that the interest in including discourse as a possible focus of areal linguistics 
also foregrounds the importance of attitude, as in the study by Moag and Poletto (1991). 

3 My own dissertation on Tamil aspectual verbs (Schiffman 1969) was an attempt to describe
them within the generative paradigm prevalent at the time, but must now be classified as
a failure, since the “Aspects” model then current would not tolerate variability. As one 
of my dissertation advisers put it, “Rules is rules.” 

4 In many languages the world over, the kinds of verbs recruited to be grammaticalized 
are often common verbs of motion, such as ‘go’, ‘come’, ‘fall’ as well as ‘have’, ‘be’ etc. 

5 For a list of these, see Schiffman 1999. 

6 As Maisak points out, in Chinese they lose tone and accent as they become grammat- 
icalized (Maisak 1999: 8). 

7 This example is cited from Schiffman 1999: 87. 

8 At least this is true in the languages I know best, Kannada and Tamil. 


9 Maisak quotes Andronov 1987 as referring to this as “the extravagant” or “irre- 
versible” nature of the situation. 


37 Language Contact
and Chinese 


STEPHEN MATTHEWS 


Chinese offers numerous case studies of language contact. This review will focus 
on the following topics: 


1 the role of contact in the making of the Chinese language family; 

2 substrate influence of non-Sinitic languages on the development of local 
forms of Chinese; 

3 mutual influences between Mandarin and other local dialects; 

4 substrate influence of Chinese languages in contact languages such as 
Singapore Colloquial English and Hawaiian Creole English; 

5 Asian languages influenced by Chinese; 

6 lexical and structural borrowing from Chinese in languages of Southeast Asia; 

7 code-mixing involving Chinese. 


The term “Chinese” calls for some clarification at the outset. In a general sense, 
Chinese covers any of the languages and dialects belonging to the Sinitic language 
family and standing in a certain relationship to the logographic Chinese script. 
The Chinese “dialects” (also known as Sinitic languages, cf. Chappell 2001a) are 
about as diverse as the various Romance languages such as Italian, Portuguese, 
and Romanian: Cantonese and Mandarin, for example, are about as divergent
(and mutually incomprehensible) as French and Italian. Commonly, unless other- 
wise specified, “Chinese” refers to the standard language, often without dis- 
tinction between written and spoken forms. The term “Mandarin” is used either 
to refer to the standard language, or to delineate a group of dialects spoken natively 
in northern and western parts of China. To specify the standard language as taught 
in schools, the term Putonghua (literally ‘common speech’) may be used. 
Putonghua serves as a lingua franca across the PRC, especially in areas where 
Mandarin dialects are not used natively (Escure 1998). 


1 The Making of Chinese


The development of Chinese has been compared to the evolution of the Romance 
languages from Latin (Norman 1988: 187). In each case, an expanding empire spread 
a common language across much of a continent, beginning some two thousand 
years ago. The spread of the language of empire was at the expense of a diverse 
range of local languages. As their speakers adopted the spreading dominant 
language, these local languages played the role of substrates in the development 
of local varieties of it: the Romance languages and the “dialects” of Chinese,
respectively.

Just as Latin and its descendants have influenced other languages of Europe 
through structural and lexical borrowing, so Chinese has cast its influence across 
East Asia. Those languages not swallowed up by the spread of Chinese have 
borrowed from it more or less extensively: languages within the Chinese sphere 
of influence (termed the “Sinosphere” by Matisoff; see Bradley et al. 2003) 
include Vietnamese, the Tai-Kadai and the Hmong-Mien languages. 


1.1 Contact and genetic relationships 


The mutual influences between Chinese and neighboring languages run so deep 
that it has not always been clear whether these languages bear a genetic relationship 
to Chinese, or one of contact (Matisoff 2001). In particular, the Tai-Kadai and 
Hmong-Mien (Miao-Yao) languages were once assigned genetically to the Sino- 
Tibetan family, and these assumptions are occasionally still followed in mainland 
Chinese works (LaPolla 2001: 227). Outside China, it has become generally 
accepted since Benedict (1972) that the relationship is one of extensive contact and 
mutual influence: on the one hand, the Tai-Kadai and Hmong-Mien languages 
have been substrate languages in the development of local forms of Chinese; on 
the other, they themselves exhibit extensive lexical and structural borrowing from 
Chinese. A clear example involves the numeral system in Tai languages such as 
Thai and Zhuang, which have borrowed virtually all the number terms above “two” 
from Cantonese (see Table 37.1). These data illustrate the principle that cognate 
numerals (with the possible exception of “one” and “two,” which show no such 
correspondence in Table 37.1) are not reliable indicators of genetic relationship, 
since they are easily borrowed. By contrast, areas of basic vocabulary such as 
kinship terms are quite different in Tai and Chinese, reflecting their distinct genetic 
origins. The Tai languages are now seen as related to Austronesian (see Sagart 
2004 for a recent version of this view), while Chinese is clearly related genetically 
to the Tibeto-Burman languages, forming a branch of the Sino-Tibetan family (see 
Thurgood & LaPolla 1999). 


1.2 Substrate influence in Chinese 


Table 37.1 Numerals in Tai languages and Cantonese*

      Thai     Zhuang (Bodomo & Pan 2007)   Cantonese
 1    neng     ndeu                         jat
 2    soong    song                         ji/loeng
 3    saam     sam                          saam
 4    sii      sei                          sei
 5    haa      ha                           ng
 6    hok      roek                         luk
 7    cet      caet                         cat
 8    peet     bet                          baat
 9    kaaw     giu                          gau
10    sip      cip                          sap

* Numerals are shown in the standard romanization system of each language. Tone
markings are not shown to avoid confusion between the systems.


The diversity of Chinese dialects (or, to emphasize this diversity, the Sinitic
languages) is due in part to the various substrate languages spoken in different regions
of China. As Chinese armies and settlers moved south, they interacted with a host
of indigenous peoples collectively known to the Chinese as the “hundred Yue.” 
Although the full range of languages involved cannot be reconstructed, these 
peoples clearly included Tai-Kadai- and Hmong-Mien-speaking populations. 
Evidence can be seen in the toponymy of South China, where place names of 
Tai-Kadai and Hmong-Mien origin can be identified (in Hong Kong, for example, 
the place name Pokfulam can plausibly be interpreted as a Tai toponym meaning 
‘waterfall’). Some characteristic grammatical features of Cantonese can also be traced 
to the Tai substrate (see Matthews 2006b for a review). For example, while 
modifiers precede the head noun in Chinese, Cantonese and Hakka have a small 
number of head-modifier compounds, matching the order in corresponding 
expressions in Zhuang (the most influential of the extant Tai languages): 


(1) Cantonese    Hakka        Zhuang
    zyu1-laa2    tsu-ma       mou-me
    pig-female   pig-mother   pig-mother
    ‘sow’        ‘sow’        ‘sow’


Similarly, in contrast to virtually all adverbs in Chinese, which must appear
before the verb, Cantonese sin1 ‘first’ and tim1 ‘too’ follow the verb:¹


(2) a. Ngo5 zau2  sin1
       I    leave first
       ‘I’m leaving now.’
    b. zung6 ho2ji5 jau4seoi2 tim1
       still can    swim      too
       ‘And you can go swimming, too.’

This usage closely matches that of semantically similar adverbs in Tai languages
(cf. Thai kɔ̀ɔn ‘first’ and dûay ‘too’), suggesting that this exceptional (and indeed
emblematic) property of Cantonese is the result of substrate influence (Lucas & 
Xie 1994: 200). The process of language shift continues to this day, with Tai-speaking 
minority peoples such as the Zhuang progressively shifting to Chinese. 

In the north, similarly, non-Chinese languages have left their mark on Mandarin 
dialects. The Yuan dynasty rulers spoke Mongolian, while the Jurchen (rulers of 
the Jin empire from 1115 to 1234) and the Manchus (rulers of China during the 
Qing dynasty, 1644-1911) spoke Tungusic languages. Following the collapse of 
the Qing Dynasty in the first years of the twentieth century, the Manchu-speaking 
population underwent rapid assimilation and shifted to Mandarin. Some prop- 
erties of northern Mandarin such as the distinction between inclusive zanmen and 
exclusive women ‘we’ have been attributed to substrate influence from Manchu 
(Hashimoto 1976). 


2 Areal Typology 


As a result of these diverse influences, a large-scale pattern can be identified along 
a roughly north-south axis in terms of areal typology (Hashimoto 1976): 


Northern China: Fewer tones, more SOV sentences and head-final constructions 
Southern China: More tones, more SVO sentences and more head-initial 
constructions 


Tone systems range from four or occasionally three tones in Mandarin dialects 
to six in Cantonese (nine by some criteria) and seven to eight in southern 
Min dialects such as Hokkien and Chaozhou. In their word-order patterns the 
southern dialects conform fairly closely to an SVO prototype, while the northern 
dialects show more SOV characteristics (including actual SOV sentences, albeit 
as a minority pattern). For example, in Mandarin prepositional phrases encoding 
the semantic role of goal precede the verb, resulting in a verb-final clause: 


(3) Women dao Beijing qu
    1PL   to  Beijing go
    ‘We go to Beijing.’


The southeastern Wu, Gan, and Min dialects represent intermediate types in this 
respect. On an even larger geographical scale, Chinese as a whole is typologically 
intermediate between the head-final SOV structure of the Altaic (Turkic, 
Mongolic, and Tungusic) languages to the north and the head-initial SVO 
structure represented by the Tai languages to the southwest (Comrie 2008). 
Hashimoto’s “Altaicization” hypothesis attributes the head-final characteristics 
of Mandarin Chinese to the influence of the surrounding Altaic languages. 
In support of this hypothesis, ongoing contact with Altaic languages can be seen 
clearly today in the northwestern provinces such as Qinghai, where Chinese comes
into contact with Amdo Tibetan and Mongolic languages such as Monguor. An 
example from the Xining dialect shows a postposition apparently borrowed from 
Monguor (Dede 2007: 68): 


(4) {th 73 dt sa HE OK 
ta’ ite? pi'tcid® sa tid’ xuirle 
3SG yesterday Beijing post just return 
‘He just came back from Beijing yesterday.’ 


In extreme cases, SOV word order is adopted together with Altaic case suffixes, 
representing a wholesale typological transformation of Chinese from isolating SVO 
to agglutinating SOV. 


2.1 Areal patterns of grammaticalization 


Another areal feature involves recurrent patterns of grammaticalization. That is, 
across a wide area, functionally parallel grammatical morphemes are transpar- 
ently based on the same lexical items. A case in point is the verb ‘get’ or ‘acquire’ 
as a modal auxiliary, described as an “epidemic” in Enfield (2003) and illustrated 
by dak1 in Cantonese (5a). Another is the ‘surpass’ comparative (Ansaldo, 
forthcoming) as in (5b) from Cantonese, where the verb ‘pass’ comes to express 
a comparison of superiority: 


(5) a. Ngo5-dei6 zau2  dak1
       1-PL      leave get
       ‘We can leave (we get to leave).’
    b. Lei5 faai3 gwo3 ngo5
       you  fast  pass me
       ‘You’re faster than me.’


These are instances of contact-induced grammaticalization as defined by Heine 
and Kuteva (2003; 2005). Cantonese shares both these properties with Tai languages 
(similar patterns also exist in Mandarin, but to a much more limited degree). 

Another areal pattern is the grammaticalization of the verb ‘say’ as a comple- 
mentizer meaning ‘that’ (Chappell 2008). In (6), the verb waa6 ‘say’ serves to 
introduce a complement clause: 


(6) Keoi5 tung4 ngo5 gong2 waa6 m4  dak1haan6
    s/he  with  me   talk  say  not available
    ‘He told me that he wasn’t free.’


These patterns of grammaticalization arguably derive through reanalysis of serial 
verb constructions (Matthews 2006a). 


3 Mutual Influences between Mandarin and
Other Chinese Dialects 


Parallel to the bidirectional influences between Chinese and non-Sinitic languages 
are contact relationships between standard written Chinese (together with spoken 
Mandarin) and local dialects. On the one hand, local dialects exert substrate influence 
on the local forms of Mandarin (Escure 1998). For example, the form of Mandarin 
known as Guoyu in Taiwan is strongly influenced by the indigenous Min dialect 
of Taiwan, known as Minnanhua (southern Min speech) or Taiwanese (Kubler 
1985; Cheng 1997). Specifically, Taiwan Mandarin adopts sentence-final particles
such as ou from Taiwanese; the verb shuo ‘say’ is used as complementizer
following the southern Min pattern, similar to the Cantonese example in (6); and 
you ‘have’ is used extensively as an auxiliary verb, like its counterpart u in the 
Min dialects. 

At the same time, homogenization among dialects occurs through diffusion of 
features from the standard language. In colloquial Cantonese, the comparative 
construction is a head-initial one using gwo3 ‘pass’, as illustrated in (5b) above. 
The standard written construction using bi ‘compared to’ (read as bei2 in 
Cantonese) is increasingly used in Cantonese, spreading from formal to less for- 
mal contexts. The resulting construction is head-final, as in (7): 


(7) Lei5 bei2    ngo5 faai3
    you  compare me   fast
    ‘You’re faster than me.’


Another possible outcome of such influence is hybridization (Chappell 2001b). In 
the Xiamen dialect, the Mandarin comparative morpheme bi (written in Min as 
pi due to the different phonological and hence romanization system) is combined 
with the indigenous comparative marker k’ah as in (8): 


(8) I pi gu k’ah  u-le-so 
3SG compare me more polite 
‘She is more polite than me.’ 


Such a combination also occurs in the experiential aspect. The Southern Min dialects 
have a preverbal aspect marker pak derived from the verb ‘know’ (cf. I’ve been 
known to...) while the Mandarin and pan-Chinese marker is postverbal guo 
(derived, like the comparative marker illustrated in (5b) above, from the verb ‘pass’). 
In modern Southern Min usage the preverbal marker and postverbal markers 
typically combine as in (9): 


(9) Gua pak   ki kue   pakkia.
    I   EXPER go EXPER Beijing
    ‘I’ve been to Beijing.’

In such cases, contact with the standard language results in innovative intermediate
forms, rather than complete homogenization.


4 Chinese as Substrate in the Formation of 
Contact Languages 


Chinese has played the role of substrate in a number of contact languages. The 
cases to be discussed are all the result of emigration from China (particularly the 
southern provinces of Guangdong and Fujian). An early example is that of Baba 
Malay, resulting from settlement of traders from Fujian province in Malacca (and 
later also in the Straits Settlements of Penang and Singapore). The “Babas” were 
men speaking Min dialects (principally Hokkien) who intermarried with Malay 
women, the “Nyonyas” (Tan 1988). The language has a layer of Hokkien vocabu- 
lary, including notably kinship and other culturally specific terms such as encik 
‘paternal uncle’. There is also grammatical influence, such as the use of kasi 
‘give’ as a passive marker following the Hokkien model (Ansaldo & Matthews
1999: 53).

More generally, extensive Chinese settlement in Malaysia and Indonesia has 
left its mark on Malay. A case of interest involves the pronouns gua ‘I’ and lu 
‘you’ from Hokkien which are widely used even in speech communities not under 
direct influence from Chinese. This is a counter-example to the generalization that 
pronouns are not normally borrowed, which has been an important assumption 
in the investigation of long-range genetic relationships. 

Chinese Pidgin English (CPE) developed as a prototypical trade pidgin in the 
context of the China trade which flourished in southern Chinese ports in the eigh- 
teenth and nineteenth centuries (see also Ansaldo, this volume). Used between 
European traders and Chinese merchants, it was never the native language of a 
speech community. As would be expected, CPE shows substrate influence from 
Chinese, and Cantonese in particular. An unusual case of substrate influence is 
the use of piece as a noun classifier as in one piecee coolie ‘a worker’. Recent findings 
show stronger Cantonese influence in texts written in Chinese (notably phrasebooks: 
Bolton 2003; Li, Matthews, & Smith 2005). An example involves wh-questions which 
regularly show fronting of the wh-phrase in English-language sources, as in (10a). 
In the Chinese sources, by contrast, wh-in situ questions following the Chinese 
pattern (10b) are commonly found (Ansaldo, Matthews, & Smith, forthcoming): 


(10) a. how muchee you gib (‘how much are you offering?’) 
b. you give what price (‘what price do you give?’) 


Some lexical items introduced into CPE such as taipan (Cantonese daai6 baan1 ‘boss’) 
have remained current in Asian English. Some loan translations such as long time 
no see (Cantonese hou2 loi6 mou5 gin3) also appear to have entered US English 
through pidgin English.

In the twentieth century, the development of Singapore Colloquial English (SCE)
has been profoundly influenced by Chinese, and specifically by the southern dialects 
spoken ancestrally by the majority of the Chinese population: Hokkien, Teochew 
(Chaozhou), Hainanese, Hakka, and Cantonese. For example, questions of the
form “...or not?” as in (11) are characteristic of the Min (Hokkien and Teochew)
and Hakka dialects:³


(11) You got automatic or not? (Gupta 1994: 127) 


Chinese influence on SCE is so extensive that Bao (2005) argues for systematic 
transfer of the entire aspectual system from Chinese to SCE. Again, in order 
for a substrate explanation to be feasible, account must be taken of specifically 
Hokkien patterns such as the preverbal experiential marker pak (as seen in (9) above) 
which appears to underlie the use of ever as in (12):⁴


(12) I ever been out with her before. 
‘I’ve been out with her before.’ 


On “Singlish” as a contact language, see also Ansaldo (this volume). 

A more complex case of a contact language showing Chinese influence is 
that of Hawaiian Creole English (HCE, locally still called Pidgin). As plantation 
workers flooded into Hawai'i in the early twentieth century, speakers of Chinese 
(Cantonese and Hakka), Japanese, Portuguese, Tagalog, and other Philippine lan- 
guages communicated in pidgin English, from which HCE developed (Reinecke 
1969). Despite the diverse range of substrate languages involved, specific features 
have been attributed to the Cantonese substrate. For example, Siegel (2000: 212) 
notes that get is used in HCE in existential as well as possessive senses, as seen 
in (13): 


(13) Get wan wahine shi get wan data. 
‘There is a woman who has a daughter.’ 


Here the first occurrence of get has an existential sense (‘there is’) while the sec- 
ond has the possessive sense ‘has’, like the Cantonese verb jau5, which has both 
the existential and possessive senses. Siegel (2000: 214) notes that Portuguese tem 
‘has’ also has the existential sense ‘there is’, which would have reinforced the 
pattern of polysemy. 

Another contact language showing Sinitic influence is Macanese, a Portuguese- 
based creole which developed in the colony of Macau. Closely related to Papia 
Kristang as spoken in Malacca, the substrate languages include Malay as well as 
Cantonese (Ansaldo & Matthews 2004). Although virtually extinct, Macanese left 
its mark on CPE in the form of Portuguese terms such as ladron ‘thief’ and conta 
‘account’ (Li et al. 2005: 103). 


5 Asian Languages Influenced by Chinese:
Japanese, Korean, and Vietnamese 


Major languages of East Asia, notably Japanese and Korean, have been profoundly 
influenced by Chinese. Pursuing the parallel between Chinese and Latin invoked 
at the beginning of this chapter, the influence of China on these East Asian 
cultures may be compared to that of Greece on Rome. Especially during the Tang 
period (618-907 CE), Chinese poetry and art as well as Buddhist teaching spread 
to Japan and Korea. Not only did massive lexical and structural borrowing take 
place, but the writing system itself was borrowed to create the first writing sys- 
tems for these languages. The Japanese kanji (logographic characters represent- 
ing lexical items) are directly borrowed from Chinese, while the hiragana and katakana 
syllabaries (used to represent grammatical inflections and loanwords respec- 
tively) were also derived from Chinese characters. Japanese written characters (kanji) 
typically have two readings: a Sino-Japanese reading (based on a loanword from 
Chinese) and a Japanese one (the indigenous term). For example, the Japanese 
character 車 ‘car’ can be read as sha following the Chinese etymology (Mandarin
chē) and this pronunciation appears in compounds such as 車両 sharyō ‘vehicle’,
but 車 meaning ‘car’ is normally read as kuruma, using the indigenous Japanese
word for ‘vehicle’. 

In contrast to Japanese, Korean and Vietnamese later came to be written alpha- 
betically. Nevertheless, both retain layers of vocabulary from Chinese. In Korean, 
many formal terms such as sŏnsaeng ‘Sir, Mr.’ are of Chinese origin (cf. Mandarin
xiansheng, Cantonese sin1saang1). 

Vietnamese, which belongs genetically to the Mon-Khmer branch of the 
Austro-Asiatic family, has been transformed typologically through contact with 
Chinese. Austro-Asiatic languages are thought to have been originally non-tonal.
Following the Chinese conquest of Vietnam (111 BCE-938 CE), massive
lexical borrowing with retention of tonal contours of the loanwords led to the 
development of a lexical tone system in Vietnamese (LaPolla 2001: 227). For 
example, hoa ‘flower’ has a level tone reflecting that of its Chinese source 
(Mandarin hua). 


5.1 Code-mixing in bilingual communities 


Code-mixing between Cantonese and English has long been a feature of the Hong 
Kong linguistic landscape, and has been the object of much research. Some 90 
percent of the population are native speakers of Cantonese, while English is used 
extensively in higher education, the civil service, and commerce. Consequently, 
a substantial proportion of the population consists of fluent bilinguals who habit- 
ually code-mix. Typically, Cantonese can be identified as the matrix language, 
and the structure of mixed utterances respects the grammar of Cantonese. For 
example, Cantonese aspect markers are attached to embedded English verbs, as 
in the dialogue in (14): 


(14) A: Lei5 confirm-zo2 go3 itin mei6 aa3?
you confirm-PERF CL itinerary not.yet SFP 
‘Have you confirmed the itinerary?’ 
B: Mei6, zung6 check-gan2 
not.yet still check-PROG 
‘Not yet, I’m still checking.’ 


The reverse pattern (a Cantonese verb suffixed with -ing) does not occur in speech. 

Cantonese-English code-mixing presents some challenges to putative con- 
straints on code-mixing (Chan 1998; 2003). In general, for example, closed-class 
items such as prepositions are rarely switched, yet English prepositions can be 
embedded in a Cantonese frame as in (15): 


(15) Ngo5-dei6 hai6 under Art Faculty gaa3 
1-PL be under Arts Faculty SFP
‘We're under the Faculty of Arts.’ 


Such mixing is characteristic of fluent bilinguals, and reflects the widespread 
use of English in education and business (Gibbons 1987). Motivations for code- 
mixing include convenience, prestige and humour (Li 1996; 1998). Comparable 
patterns of code-mixing are also found between Taiwanese and Mandarin in Taiwan, 
and between Mandarin, English, and other languages in Singapore (Lee 2003). 


6 Chinese Loanwords in English 


Numerous loanwords of a cultural nature have been borrowed into English. Terms 
for Chinese cultural practices include feng shui (‘wind [and] water’, the art of geo- 
mancy), kung fu, and kowtow (from Cantonese kau3 tau4 ‘incline [one’s] head’). 
Many loans are from Cantonese: the cooking utensil wok, for example, can be 
recognized as Cantonese by the final stop. Food terms such as lychee, kumquat, 
dim sum, and wonton all show characteristically Cantonese forms. This reflects the 
role of Hong Kong, where Cantonese has long been the dominant dialect, as a 
meeting point between China and the English-speaking world. 

Loanwords deriving from the China trade include tea (< Hokkien te) from the 
dialect of another treaty port, Amoy (now Xiamen). The variant char is from another 
dialectal form of the same pan-Chinese word (Cantonese caa4 and/or Mandarin 
cha). Varieties of tea are also from various dialects, as in the case of pekoe (from 
Amoy pek ho ‘white down’) and oolong (from Cantonese wu1 lung4 ‘black dragon’, 
Chan & Kwok 1990). 


7 Conclusion 


China in Mandarin is Zhongguo, literally the ‘Middle Kingdom’. Effects of lan- 
guage contact involving Chinese reflect the central place of China in geography 
and history. As we have seen, the influence of Chinese across East and Southeast
Asia has been comparable to that of Latin in Europe. Beyond Asia, Chinese influence 
is seen in other contact languages as far afield as Hawaiian Creole. At the same 
time, in the course of its spread across the “Sinosphere” (Bradley et al. 2003), Chinese 
has itself been transformed. 

Thanks to this wealth of contact phenomena, China has been a testing ground for 
theories of substrate influence, structural borrowing, areal diffusion, and contact- 
induced grammaticalization. Much of this interaction has yet to be uncovered. 
Especially in the southwestern provinces, hundreds of poorly documented 
minority languages have undergone transformation under Chinese influence, 
and in the process have left their mark on local forms of Chinese. This promises 
to be a rich area of research as the documentation of minority languages proceeds. 


NOTES 


1 Cantonese examples are given in the Jyutping romanization system developed by the 
Linguistic Society of Hong Kong, in which tones are notated by numbers from 1 (high 
level) to 6 (low level). 

2 In the following sections these abbreviations have been used in the sample data: CL = 
CLASSIFIER, EXPER = EXPERIENTIAL, PERF = PERFECTIVE, PL = PLURAL, PROG =
PROGRESSIVE, SFP = SENTENCE-FINAL PARTICLE. 

3 The names of dialects often reflect the pronunciation of place names in the local 
dialect. Thus Hokkien is the southern Min pronunciation of the name of the province 
Fujian, while Teochew reflects the pronunciation of the place name (Chaozhou in 
Mandarin) in the Chaozhou dialect. 

4 Malay also has a preverbal adverb pernah with similar experiential semantics. In this 
and some other areas, Malay may have contributed to SCE or reinforced the influence 
of southern Chinese dialects. 


38 Contact and Indigenous
Languages in Australia 


PATRICK McCONVELL 


1 Introduction 


Throughout the nineteenth and early twentieth century many of the original 250 
languages of Australia ceased to be spoken, especially in areas where the indigen- 
ous peoples’ land was intensively settled and farmed by the newcomers. Since 
that time there has been a collapse of the languages, with fewer than 20 currently 
being passed on to children (McConvell, Marmion, & McNicol 2005; Walsh 2005; 
McConvell & Thieberger 2006). Many of those who have shifted away from tra- 
ditional languages speak distinctive Aboriginal dialects of English and, in the north, 
varieties of an English-based creole language (Kriol and Torres Strait Creole). 

The present population of indigenous people in Australia is around 400,000.
Assuming the population at early colonization to have been about 750,000, the 
average size of a language group (including several dialects) would have been 
3,000. Some groups were much smaller, of the order of 100-200, while a few may 
have been considerably larger, but probably no more than about 5,000-6,000
speakers. The low population sizes are not unusual for hunter-gatherer groups 
in many areas of the world, but they may have implications for language contact 
phenomena as exposure to neighboring languages and multilingualism is more 
common than with languages with larger territories and populations (e.g. Sutton 
1978; Brandl & Walsh 1982). 
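
The average group size quoted above follows from a single division. The sketch below simply reproduces that arithmetic, taking the two estimates given in the text (a population of about 750,000 and about 250 languages) as its only inputs; the variable and function names are illustrative assumptions.

```python
# Both figures are the rough estimates quoted in the text, not measured data.
population_at_colonization = 750_000
number_of_languages = 250


def average_group_size(population: int, languages: int) -> float:
    """Mean number of speakers per language group."""
    return population / languages


average = average_group_size(population_at_colonization, number_of_languages)
print(f"Average speakers per language group: {average:,.0f}")
```

Since this is a mean, it is compatible with the range mentioned in the text, from groups of 100-200 speakers to a few of several thousand.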

The first topic covered in this chapter is the role of language contact and dif- 
fusion in the history of Australian indigenous languages. Much controversy sur- 
rounds this issue as it has been suggested, particularly by Dixon (1997; 2002), that 
the role of diffusion and convergence is much more significant in Australia than 
elsewhere, and that the conventional models of language families generated by 
the comparative method in linguistics may be inapplicable in Australia for this 
reason (Dixon 2002: 699). This view is contested by other Australianist linguists 
(e.g. Alpher 2005; Sutton & Koch 2008). 

Beyond this debate, some studies of the intertwining of inheritance and
diffusion in particular areas of Australia are examined, along with their
contribution to understanding prehistory more generally.

Attention then turns to the new contact languages and contact interactions
of recent times in Australia, including the early pidgins, and the creoles which 
they engendered when they became the first language of children of speakers 
of traditional indigenous languages. As well as shift to these creoles, the mod- 
ern situation includes examples of heavy influence of English and creoles on 
the traditional indigenous languages, and “mixed languages” emerging from 
hybridization of the traditional languages with creoles. 


2 The Role of Diffusion in the History of 
Indigenous Languages 


2.1 Approaches to comparative linguistics in Australia 


The standard approach to reconstructing families based on the comparative 
method in linguistics, in which the recognition of language contact effects plays 
a complementary role to that of inheritance of linguistic features, has had a slow 
and faltering start in Australia. 

With the work of Ken Hale and Geoffrey O’Grady in the 1960s (e.g. O’Grady, 
Voegelin, & Voegelin 1966; O’Grady & Hale 2004) the notion of a Pama-Nyungan 
language family was proposed, encompassing most languages in Australia 
except for the central north. Dixon also began comparative work and proposed 
features of a “proto-Australian” (1972; 1980). Later commentators saw this con- 
struct as biased toward what are generally considered as Pama-Nyungan languages, 
and while Dixon’s second book on Australian languages (2002) was intended in 
part to right this, the bias continues — significantly by explaining the distinctive 
typological features of Non-Pama-Nyungan by processes of diffusion, as well as 
emphatically rejecting the notion of a Pama-Nyungan family. 


2.2 Heath on Arnhem Land


Jeffrey Heath carried out fieldwork on a number of languages of Eastern Arnhem 
Land in the 1970s and published a detailed study of linguistic diffusion in the 
area (Heath 1978; 1981), which had major impact both in Australia and overseas. 
Some formed the impression from this work that linguistic diffusion, both of 
forms and structures, was exceptionally common in this region and in Australia 
in general. This is not a conclusion of the book, and certainly the results do not 
support conjectures (to be examined below) that diffusion is so prevalent that 
linguistic subgrouping using the classic comparative method is not feasible in 
Australia. Rather the reverse: Heath carries out the subgrouping task successfully 
in tandem with his attention to borrowing, with the two acting as complemen- 
tary and indispensable to each other. The general conclusion is that direct 
borrowing of affixal morphemes is more common, but convergence of mor- 
phosyntactic structures is much less common, than in the classic case of areal 
convergence of grammatical typology with retention of distinct lexicon and mor- 
phological forms described by Gumperz and Wilson (1971). Heath ascribes this 
difference to there being less need to use emblematic forms to assert linguistic
identity in Australian indigenous situations and less use of code-switching 
between languages in Arnhem Land historically than in India (1978: 142-3). 

While Heath studied borrowing between a number of language pairs in the region, 
probably the most striking are borrowings between the Non-Pama-Nyungan 
languages and the Yolngu (Pama-Nyungan) language Ritharrngu. Yolngu is an
outlier of the Pama-Nyungan family in northeast Arnhem Land completely sur- 
rounded by Non-Pama-Nyungan languages. The Non-Pama-Nyungan languages 
are head-marking with complex verb prefix morphology including bound 
pronominal forms for subject, object, and other functions, and noun classes 
marked by prefixes. The Yolngu languages are dependent-marking with no 
pronominal affixes on the verb and no noun classes. As well as this typological 
contrast, the basic lexicon of Yolngu is very different from that of the Non-Pama- 
Nyungan languages, although some lexical borrowing has occurred in both 
directions. 

Affixes borrowed include the ergative suffix -dhu from Ritharrngu into Ngandi 
as -thu (Heath 1978: 76-7). The borrowing of ergative suffixes into Non-Pama- 
Nyungan from neighboring Pama-Nyungan languages seems to have occurred 
elsewhere in border regions (e.g. Gooniyandi, McGregor 1990: 179). 

Heath also claims that noun class prefixes were diffused through the Non-Pama- 
Nyungan languages of Arnhem Land. For instance forms found in Warndarang 
appear to be related to forms in the Nunggubuyu-Ngandi family of languages, 
not to anything in other languages in the subgroup to which Warndarang 
belongs, which includes Mara and Alawa. Evans (2003: 16) thinks Heath may 
be overstating diffusion here, as while there may be no cognates retained in the 
subgroups where the class prefixes are found, there may be cognates in more 
distant languages outside Eastern Arnhem Land, suggesting that these are old 
inherited items which have been lost in some subgroups. 

Heath concludes (1978: 105) that there are various categories of morphemes which 
are either not diffused at all or very rarely: these include (1) verbal inflectional 
affixes; (2) bound pronouns; (3) independent pronouns; and (4) demonstratives. 
Behind this he sees a borrowability hierarchy which predicts in general terms which 
items are more prone to borrowing than others. Those items which diffuse tend 
to be syllabic; have clear boundaries; be unifunctional (not portmanteau); have 
categorical clarity (do not depend on environment for determination of their func- 
tion); and have “analogical freedom” — absence of analogical pressure such as is 
exerted by free pronouns on bound pronominal systems. He finds that functional
considerations (e.g. that a borrowed morpheme is filling a “functional gap”
in the target language) are of very little significance (1978: 117). Heath also
analyzes possible cases of structural diffusion (section 2.8 below). 


2.3 From equilibrium to “punctuated equilibrium” 


Dixon had early on begun to see diffusion as a key theme in Australian linguistics. 
One of the reasons for the apparent high rates of diffusion of lexical items was, 
he argued, the practice found all over Australia of the tabooing of words sounding
similar to a dead person’s name, to which the solution for filling the gap was
often a loanword. The prevalence of diffusion led to a phenomenon of “equilib- 
rium” whereby languages next to each other for a long period would end up shar- 
ing around half of their lexical items. This generalization has been challenged by 
a number of authors, notably Alpher and Nash (1999; see section 2.4 below). 

Dixon’s later work launched the idea of “punctuated equilibrium” and began 
to suggest that the comparative method did not work, especially in Australia. The 
idea is that only “punctuations” — bursts of language split and spread — yield 
classical family trees of languages. However those language groupings which 
can still be analyzed as cladistic (tree-like) are typically associated with relatively 
recent spreads of languages. In cases like Australia, according to this theory, punc- 
tuations following initial colonization of the continent perhaps 40,000 to 50,000 
years ago have long ago been overlaid by massive diffusion of all kinds of 
linguistic items, to the extent that any original families or subgroups cannot be 
discerned. 


2.4 Does diffusion vitiate the comparative method 
in Australia? 


The picture being painted by Dixon does not accord with basic observations that 
linguists have been making about the relationship of languages. The languages 
are clearly capable of being grouped into sets based on shared characteristics and 
the level of similarity is not a function of geographical proximity as it would 
be under the “equilibrium” theory. People who have worked along the
Pama-Nyungan/Non-Pama-Nyungan border in Northern Australia are constantly
confronted with the major differences in grammar and vocabulary between these
different adjacent groupings of languages. There is borrowing of some elements 
of vocabulary and a few examples of partial convergence grammatically, but the 
languages remain starkly different in their core. 

Strong evidence for the existence of families and subgroups in Australia, 
including the very large Pama-Nyungan family of languages, is now being 
amassed in articles, books, and databases. A requirement for establishment of 
families and subgroups in the comparative method is a bundle of shared inno- 
vations which defines the group, in most cases including regular sound changes. 
Studies in Bowern and Koch (2004) establish groupings in Australia in this stan- 
dard way. 

Dixon sets the bar very high for accepting an item as evidence for subgrouping.
For instance he claims that the root ngali, the first person dual pronoun ‘you
and I’ in most Pama-Nyungan languages, is not an item inherited from
proto-Pama-Nyungan, as most Australianist linguists believe, but a loanword with an
extremely wide distribution which happens to coincide with what other linguists
see as the area of Pama-Nyungan (2002: 277-82). One of Dixon’s arguments against
the proposal that it is a proto-Pama-Nyungan form is that it is not found in 7 of
the 200 or so Pama-Nyungan languages. It is very rarely the case for any of the
well-attested language families or subgroups in the world that a proto-root
has reflexes in all or nearly all of the daughter languages, and such coverage is
not seen as a criterion for reconstruction to a protolanguage in standard linguistics.
For rebuttal of specific diffusion scenarios for ngali see Evans (2005: 264-8) and
Sutton & Koch (2008: 484).


2.5 Levels of lexical borrowing 


Dixon has consistently argued (e.g. Dixon 2002: 27-30) that diffusion between adja- 
cent distantly related languages causes their levels of shared vocabulary to rise 
to something approaching 50 percent, thus calling into question use of lexicostatistics 
as an indicator of subgrouping. Alpher and Nash (1999) argue that the model is 
not correct and one would rather expect levels in the order of 25 percent. Evans 
(2005: 258-61), Black (2006), and Sutton and Koch (2008: 493) support this with 
a survey of counts which shows that 25 percent is rarely exceeded. In the cases 
where the figure is higher, special factors are in play. One of these may be the 
presence of both an adstratal and substratal non-inherited component, as perhaps 
in the case of Gurindji which has figures around 35 percent reported by 
McConvell (to appear b). 

There is no doubt that borrowing between languages is a major factor, in some 
cases elevating shared vocabulary counts. Dixon disputes that type of vocabulary
(“core” versus “non-core cultural”) plays a role. The standard method of vocabulary
comparison as used by O’Grady and Hale for instance involved use of the
Swadesh list or variants and lexicostatistics. Black (1997) shows that the assumption
that basic core vocabulary is more resistant to change stands up well in
Australia. Bowern (2006: 254-7) retests one example of the levels of shared
vocabulary between unrelated languages which Dixon (2002: 46-7) claims to be
relatively high due to “equilibrium” diffusion, Karajarri (Pama-Nyungan) and Bardi
(Non-Pama-Nyungan) in the western Kimberley, and finds that the shared level 
is much lower than in Dixon’s findings: for core vocabulary it is 8 percent, and 
non-core 21 percent, also contradicting Dixon’s assertion that there is no appre- 
ciable difference between these two categories of vocabulary in such measures. 
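
The kind of count at issue here can be illustrated with a minimal sketch: the percentage of shared items is computed separately for core and non-core vocabulary. This is not the procedure of any of the studies cited. The word list, the category labels, and the shared/not-shared judgments are invented placeholders, and the function name is hypothetical.

```python
from typing import List, Tuple

# Each row: (meaning, category, judged_shared).
# The rows are invented placeholders, NOT data from Karajarri, Bardi,
# or any real word list.
Comparison = Tuple[str, str, bool]

TOY_COMPARISONS: List[Comparison] = [
    ("hand", "core", False),
    ("water", "core", False),
    ("eye", "core", True),
    ("sun", "core", False),
    ("spear", "non-core", True),
    ("axe", "non-core", True),
    ("net", "non-core", False),
    ("canoe", "non-core", False),
]


def shared_percentage(rows: List[Comparison], category: str) -> float:
    """Percentage of items in one category judged to be shared."""
    relevant = [shared for _, cat, shared in rows if cat == category]
    if not relevant:
        raise ValueError(f"no items in category {category!r}")
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(relevant) / len(relevant)


core_pct = shared_percentage(TOY_COMPARISONS, "core")
non_core_pct = shared_percentage(TOY_COMPARISONS, "non-core")
print(f"core: {core_pct:.0f}%  non-core: {non_core_pct:.0f}%")
```

Keeping the two categories apart is what allows a difference between them to show up at all; a single pooled percentage would hide it.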


2.6 Borrowing between dialects and closely 
related languages 


As elsewhere in the world, there are instances where sound changes are not, on 
the face of it, regular. Many of these are due to borrowing of items including 
“reversed change borrowing” which can produce the appearance of partial 
application of a sound change to some of the lexicon (McConvell 2008c). 
Dench (2001) for the Pilbara of Western Australia and Black (2004) for Cape 
York Peninsula in Queensland are less confident that all irregularity can be 
explained in this way to allow firm subgrouping in all cases, based on shared 
innovations. The problems raised by these authors are however not unique to the 
Australian continent, and can be addressed by a combination of persistent work 
on recognition of well-grounded diffusion processes in contrast to inheritance, and
where necessary, recognizing “dialect mixing” as a feature of some reconstructed 
protolanguages (Bowern 2006). 


2.7 Linguistic stratigraphy and cultural history 


One of the main methods for detecting loanwords relies on the fact that they have
not undergone processes which have affected the inherited vocabulary, such as regular sound
changes. Importation of distinctive roots and morphology from other languages 
also often reveals a clear picture of sequence and direction of loans. 

The possibility does exist for the absolute dating available from archeology to 
provide calibration, as when a new artifact has arrived in an area and its horizon 
can be dated and additionally a new loan term for it or associated practices has 
arrived, presumably at the same time, which has distinctive patterns, e.g. absence 
of a sound change. One study which provides something approaching this kind 
of case is that of the term for muller (top grindstone) in Ngumpin-Yapa — 
marang(u) (McConvell & Smith 2003). Other work in linguistic stratigraphy 
has dealt with the diffusion of the subsection system and its terminologies 
(McConvell 1985; 1997; Harvey 2008). Gaby (2008) provides further examples of 
current work on language contact and its cultural and historical dimensions. 


2.8 Structural diffusion and areality 


Dixon (2001) refers to Australia as a vast “linguistic area” (see Bowern 2006 for 
a critical appraisal). In phonology there is a level of conformity not found on other 
continents: a small array of quite similar consonant inventories generally lacking 
fricatives and often without a phonemic voiced/voiceless distinction; dominance 
of three-vowel systems (i, a, u) with some five-vowel systems in languages of 
the north. These features may be due to diffusion resulting in convergence but 
may also be due to long-term maintenance of ancestral sound systems. 

There is some structural diffusion — borrowing of syntactic and morphological 
patterns rather than actual forms — but it probably does not have the dominant 
role sometimes claimed. Heath investigated “indirect morphosyntactic diffusion” 
in Arnhem Land which he defines (1978: 119) as “the process whereby one lan- 
guage rearranges its inherited words and morphemes under the influence of a 
foreign model, so that structural convergence results.” But on his own admission 
it is harder to substantiate than diffusion of actual forms. One of his cases is the 
hypothesis that the development of enclitic pronouns in Ritharrngu, as in (1) below, 
is due to the influence of neighboring Non-Pama-Nyungan languages such as 
Ngandi. The more common pattern — and no doubt the inherited one — in Yolngu 
is for the full independent pronouns to occur and there are no clitic or bound 
pronouns. The argument is that a typical sentence structure in Ngandi consists 
of a verb with bound pronouns followed by a right-hand subject NP (as in (1b)). 
As part of the replication of such a pattern, a Ritharrngu speaker tends to compress
the pronoun as part of the verb word forming an enclitic. The pattern adopted
in Ritharrngu according to Heath (1978: 127) is that “pronominals are obligatory
for subjects and objects even where full NP’s are present in the same clause.”


(1) a. Ritharrngu
waani-na ngay, rdarramu-ya
‘He went, the man.’
b. Ngandi
rni-rid-i rni-yul-ngu

Structural diffusion proposals include suggestions that the feature of bound pro- 
nouns may have been widely borrowed and applied to the pronominal prefixes 
of Non-Pama-Nyungan languages (Dixon 2002: 48, 404), conforming to Dixon’s 
hypothesis that the earliest Australian languages had only free pronouns, cf. Evans 
(2005: 202-7), Harvey (2003) for a contrary hypothesis that prefixing systems are 
old and inherited. This also applies to enclitic pronouns in some Pama-Nyungan 
languages (Dixon 2007; for a contrary view that pronominal enclitics in Western 
Desert and neighbors are inherited see McConvell & Laughren 2004). Another 
feature of syntactic typology whose distribution has been ascribed to diffusion 
is “switch reference” (Austin 1981; Dixon 2002: 89; McConvell & Simpson to appear). 
Other structural features in Non-Pama-Nyungan languages have also been 
attributed to Pama-Nyungan contact influence (Green 1995; Green & Nordlinger 
2004); for an argument in the opposite direction, that internal factors are more likely 
to have caused change, see McConvell (2003). 

Another type of areal diffusion of syntactic patterns is that of complex verbs 
(Dixon 2002: 184ff.). The major structural effect of Non-Pama-Nyungan language 
contact on Eastern Ngumpin languages including Gurindji, and Ngarinyman, 
its northern neighbor, is the phenomenon of complex verbs with loose nexus 
between the coverb (the main lexical meaning-bearing element) and the light or 
ancillary verb which usually accompanies it. Example (2a) is Ngarinyman with 
the coverb lurr ‘pierce’ used with the light verb yuwa- ‘put’, to mean ‘pierce’. Highly 
parallel is the Jaminjung construction in (2b) which uses the same coverb — which 
has no doubt been borrowed from Jaminjung into Ngarinyman — and the seman- 
tically equivalent light verb ‘put’, although the stem and the verbal morphology 
are quite different (thanks to Eva Schultze-Berndt for the examples: coverbs are 
highlighted in bold): 


(2) a. Ngarinyman (Pama-Nyungan) 
lurr yuwa-ni wirriny-ga
pierce put-PAST neck-LOC 
‘He pierced (it) in the neck.’ 
b. Jaminjung (Non-Pama-Nyungan) 
lurr gan-arra-ny=nu malajagu nawij-gi
pierce 3SG>3SG-PUT-PST=3SG.OBL goanna neck-LOC 
‘He pierced the goanna in the neck.’ 


Ngumpin-Yapa languages other than Eastern Ngumpin and other Pama- 
Nyungan languages also have two-part verbs but these are generally compounds 
— single words consisting of a preverb element and an inflecting verbal element. 
In contrast, complex verbs in Eastern Ngumpin (including Gurindji) are of the 
‘loose nexus’ type: the coverb and light verb are separate words, phonologically,
which may occur in either order and separated from each other, and in a number 
of contexts coverbs may occur without the verb. 

These characteristics are all found in the neighboring Non-Pama-Nyungan lan- 
guages to the north but not in the neighboring Pama-Nyungan languages. It is 
likely that these characteristics were adopted by Eastern Ngumpin languages from 
Non-Pama-Nyungan neighbors. Further evidence is to be found in the clear his- 
tory of replacement of Ngumpin-Yapa monomorphemic verb roots by complex 
verbs in the northeast, usually involving borrowing of Non-Pama-Nyungan 
coverbs. In this, we see mutual positive feedback between diffusion of grammatical 
patterns and diffusion of actual forms. The loose nexus complex verb arrange- 
ment allows insertion of invariant verbal forms into a frame with a light verb 
without the complications which arise when these elements are morphologically 
bound together, and this kind of phenomenon has been observed in a number of 
linguistic contact areas around the world. The more frequent and less marked this 
arrangement becomes, due to the volume of coverb loans, the more it becomes 
established as the standard pattern and reduces the numbers of monomorphemic 
verbs in the languages joining the area. 


2.9 Semantic areas 


The organization of lexical semantics also tends to be similar in areas, and these 
may include languages of several subgroups and families. In the area discussed 
above of Eastern Ngumpin, for instance, some polysemies follow the pattern gen- 
erally found in the Non-Pama-Nyungan languages to the north rather than those 
in the most closely genetically related Ngumpin-Yapa languages to the south. For
instance the term for ‘hill’ is the same as the word for ‘head’ in Eastern Ngumpin 
languages, as it is in the Jaminjungan languages (Western Mirndi) and Wardaman 
to the north, whereas in the other Ngumpin-Yapa languages the words for ‘hill’ 
and ‘rock’ are the same and ‘head’ does not mean ‘hill’. None of the actual words 
for these items is borrowed from one family to another, but all are inherited items 
within their own respective families. 

Bundles of such isopolysemes surround certain areas. This implies similar 
categorizations of items within such areas and similar organization of indigenous 
knowledge systems dealing with, for instance, flora, fauna and other environmental 
vocabulary within them. It is likely that such areas developed through multi-
directional diffusion of semantic organization or by adoption of polysemies from
a substrate. Certain types of phrasal idioms and derived forms also have an areal 
distribution due to diffusion (Austin, Ellis, & Hercus 1976).


3 Pidgins and Creoles
3.1 Non-English-based pidgins 


Pidginized varieties of Australian indigenous languages have rarely been 
reported. However, Dench (1998) reports on a pidgin language based mainly on 
the eastern Pilbara language Nyamal in Western Australia in the nineteenth cen-
tury. Nyamal was spoken 700 km from North West Cape with many language
territories in between. The key factor was that pearling luggers plied the coast 
and had contacts with Aborigines in various places, but notably in the Nyamal 
territory, and from there this pidgin must have been spread on the boats. 
Europeans were therefore key agents in the activity which led to the pidgin 
formation, even though it was not their language that was used. Another such 
language of intergroup communication which made a fleeting appearance in 
early nineteenth-century South Australia was jargon Kaurna, a version of the local 
indigenous language considerably simplified (Simpson 1996). 

Seafaring is also the background for another early pidgin based on elements of 
Austronesian languages of Ujung Padang, known as “Macassan,” of which again 
we only have fragmented attestation (Urry & Walsh 1981) and in this case it 
is the language of the seafarers which is adapted. This emergence is related to 
the voyages of Macassans to the northern coast of Australia to gather trepang 
(sea slug) from about 1700 to 1900, which also left a legacy of loanwords espe- 
cially in Arnhem Land (see section 4.1). Another more recent example of an 
Austronesian-based pidgin is a Malay pidgin used on boats around Broome, Western 
Australia, by Asians and Aborigines (Hosokawa 1987). 


3.2 Early English-based pidgins 


A type of pidgin English was already in use between colonists and Aborigines 
not long after the first British colony was established in Sydney in 1788, and 
by the early nineteenth century was being used as a lingua franca between 
Aborigines to some extent too (Troy 1990). There is clearly some relationship 
between this language and Pacific pidgins more generally, but what that relationship 
is has been the subject of debate. The origin of the Pacific pidgins was on ships 
and among traders in the Pacific and continued elements of more widespread ear- 
lier nautical jargons, but one view gives pride of place to the language developed 
in New South Wales as a main component of the Pacific pidgins (e.g. Baker 1996). 

As more British colonies were founded in Australia and settlement also spread 
out from NSW into Queensland, the NSW pidgin also spread and adapted as a 
lingua franca on the frontier while it became a first language of some Aborigines 
in the earlier settled parts (Dutton 1983; Mühlhäusler 1991; 1996). Leaders in spread-
ing the pidgin were Aboriginal people who traveled with the pastoralist settlers
into new country, and in some areas Afghan cameleers who transported supplies 
(Simpson 2000). 

Koch (2000) argues that some features of the early Australian pidgins derive 
from features of the local substrate languages in New South Wales. To the extent 
that these features are also found in Pacific pidgins more generally this work fits 
together with the idea that early Australian pidgins were one of the main sources 
of Pacific pidgins. 

One of the features of Australian pidgins and creoles which is shared with 
at least some of the other Pacific pidgins is the transitive marker on verbs -im 
(~-Vm). This was present in a rudimentary form at least in talk reported in early
colonial Sydney: this example is from the 1830s (Govett 1977 [1837]: 64): 


(3) Goot marning, massa, you catch him fish. 


There is no immediate parallel for such transitivity marking in NSW Aboriginal 
languages or elsewhere in the region either in Australia more generally or the 
Pacific. Koch does not argue that this feature of pidgins and creoles results from 
straightforward copying of a pattern. Rather he bases his origin hypothesis on 
the fact that many Australian (including Sydney and NSW) languages have 
zero marking for third person singular pronominal objects. On this model NSW 
Aborigines interpreted for instance “catch him” or “catch-im” as being the verb, 
without a pronoun, as in their system, so an object NP like “fish” in the above 
sentence is added to this extended verb, in the pidgin. 

This is an ingenious hypothesis and Koch (2000) develops a similar one for 
the origin of the “fellow” (Kriol -bala) suffix on adjectives in Australia (and 
the Pacific), building on the work of Baker (e.g. 1996), involving merger of fre- 
quent collocations in the superstrate into complex items which fit a grammatical 
structural pattern in the substrate. These two cases could also both be classed as 
“grammaticalization,” perhaps providing some support for Heine and Kuteva’s 
contention that language contact and grammaticalization are linked (2005). As 
Koch notes, this is just a “pilot study” at this stage; more research is needed to 
strengthen evidence for these origin hypotheses. 


3.3 “Roper River” Kriol 


During a brutal phase of the efforts of white pastoralists to “bring in” Aborigines 
leading a foraging lifestyle to work on cattle stations, the mission at Roper River 
(now Ngukurr) on the western Gulf of Carpentaria in the Northern Territory around
1908-20 intervened to collect people together who were at risk, and instituted a 
regime of dormitories for the children, partially separating them from their families. 
These families came from several different language groups in the region but the 
only common language was the English-based cattle station pidgin, which had already 
spread through a large part of the Northern Territory by the late nineteenth cen- 
tury. The children adopted and adapted this pidgin as their own language, later 
known as Roper River Creole or Kriol (Sandefur 1979; 1981; Harris 1991). Some of 
the histories portray this as a full creolization event at Roper River mission 
around 1910, but more recent research indicates that Aboriginal people in this 
area remained bilingual in traditional languages of the region until the 1930s and 
1940s, when creolization in the sense of full language shift to Kriol began to occur 
(Munro 2000: 267; 2004). This date for language shift in the Roper River region 
brings it closer to other attested dates for shift to Kriol such as the 1950s in the 
Kimberleys (Hudson 1983) and across the savannah cattle belt of the Northern 
Territory. Greater freedom of movement and employment of Aboriginal people 
associated with World War II seem to lie behind the creolization in this period. 

Most of the vocabulary of the cattle station pidgin and Kriol was based on English
although a number of items were imported from other Aboriginal languages, 
often those of New South Wales. The sound system was moulded in part to the 
phonological constraints of Aboriginal languages. The inventory of vowels and 
consonants in the basilectal versions of Kriol is very close to that of Aboriginal 
languages, with for instance fricatives such as f and s being replaced by stops 
b and tj (a lamino-palatal). Acrolectal varieties contain more of the English
sounds. The grammar was distinctively different and included the following 
features. 


(4) a. lack of tense marking on verbs, the tense-aspect system being instead 
expressed by auxiliaries such as bin ‘past tense’ preceding the verb; 

b. a pronoun system which included distinctions made in Aboriginal
languages but not in English such as inclusive and dual yunmi ‘you and
I’ contrasting with mindubala ‘he/she and I’; mindubala (dual) contrasting
with mibala ‘they and I’ (plural);

c. the -im transitive suffix;

d. absence of articles from English;

e. use of prepositions langa/la for locative/allative and bilanga/bla for
possessive (for variation in the latter, including replacement by forms
derived from English for, see Hudson 1983; McConvell 2005).


Here is an example of modern Kriol illustrating some of these features, and also 
use of English words with different meanings in Kriol (kill = ‘hit’): 


(5) Mindubala bin kil-im guwana langa riba. 
‘We two hit a goanna at the river.’ 


The second phase of creolization in the 1950s was a rapid and widespread pro-
cess in cattle station country across much of the Northern Territory and
the Kimberley region of northern Western Australia. More recently Kriol has begun 
to spread to areas such as Arnhem Land where the cattle station pidgin had not 
been spoken previously. 

Munro upholds the view that Kriol was formed at Roper River and diffused 
from there. Munro (2004, extensively cited in Siegel 2008) has argued that some 
features of Kriol are the product of substratal elements in the languages around 
its proposed origin area, the Western Gulf of Carpentaria — which would support 
the idea of genesis of the wider Kriol being located at Roper River about a 
hundred years ago. 

The pidgin from which Kriol developed was already in existence across the 
Northern Territory and the Kimberleys at the time of the mission foundation at 
Roper River in 1908, and many features of Kriol go back to earlier NSW proto- 
pidgins, as Munro notes (2000; 2004), with much putative substrate influence 
dating from that time, including the features described by Koch (2000) discussed 
above. It is debatable whether many features of Kriol can be definitively seen as
deriving from the indigenous languages around Roper River specifically. 


3.4 Torres Strait/Cape York Creole 


There are two major distinctive creole dialects in Australia: Kriol, discussed above, 
and Torres Strait Creole, currently known as Yumpla Tok — ‘our-INCLUSIVE lan- 
guage’, but previously known as “Broken” (Shnukal 1988), spoken by Torres Strait 
Islanders (who have a Melanesian appearance in contrast to Australian mainland 
Aborigines), with a closely related variety being widely used by Aborigines in 
the adjacent Cape York Peninsula (Crowley & Rigsby 1979). 

Yumpla Tok more closely resembles Pacific pidgins than Kriol does. This is 
not surprising given the role of Pacific Islander missionaries and laborers, who 
mainly used the English-based pidgin Beach-la-mar, in introducing the Pacific 
pidgin to the Torres Strait Islands in the late nineteenth century (Crowley & Rigsby 
1979: 158). The pidgin began to creolize in the 1930s with the process related to 
the rise of Torres Strait islander skippers in the pearling industry described by 
Shnukal (1985). 

Yumpla Tok differs in a number of ways from Kriol: for instance a transitive 
marker -e/-i is used more than the form which resembles Kriol -em/-im; the 
pronoun system makes a similar number of distinctions as Kriol but the forms 
are different, e.g. yumpla first person inclusive plural as in the language name, 
although the exclusive form mipla is similar to Kriol mibala (Shnukal 1988: 30). 


3.5 Aboriginal Englishes 


It is often difficult to draw the line between acrolectal varieties of pidgins and 
creoles and indigenous varieties of English. Among characteristics of indigenous 
Englishes which have been pointed to are lack of -s marking either of possessives 
or plural; lack of tense marking on the verb or in some cases altogether. 

In many cases the traditional indigenous languages have long disappeared from 
areas where Aboriginal Englishes are spoken so the language contact which 
occurred between these languages and English only leaves its mark as a legacy 
of earlier times. The phonology and some of the grammatical features are simi- 
lar to the Australian English spoken by the non-indigenous and rural working 
class, in such cases, and the extent of this origin of nonstandard features, as opposed 
to distinctly indigenous features, is not always clear in descriptions (Malcolm 2000: 
136-40). It has been argued that indigenous Englishes maintain a distinct profile 
which not only distinguishes them from standard Australian English but also links 
them to traditional languages. Apart from phonological and grammatical features 
(Eades 1996; Malcolm 2007a, 2007b), emphasis has been placed on discourse and 
pragmatic patterns such as avoidance of direct questioning, which if valid would 
be more of a cultural than a linguistic phenomenon (Eades 1982; 1983; Moses &
Yallop 2008). 


3.6 Status and role of creoles and indigenous Englishes
in education and public life 


The indigenous varieties of English and pidgins/creoles have been strongly stig- 
matized in the wider society as “bad,” “bastardized,” or “broken” English and
this attitude is often reflected within indigenous society itself, even by those who 
speak these varieties as a first or second language. Attempts to accord a more 
accepting and legitimized status to these dialects have included bilingual educa- 
tion in Kriol at Ngukurr and Barunga in the 1970s and 1980s (programs later ter- 
minated), efforts to encourage greater understanding of these issues in education 
(e.g. Eagleson, Kaldor, & Malcolm 1982; Harkins 1994; Berry & Hudson 1997; Leitner 
& Malcolm 2007), and interpreter services for Kriol in some places. 


4 Contemporary Contact Effects, Language Shift, 
and Mixed Languages 


4.1 Lexical borrowing from English and other non- 
indigenous languages in indigenous languages 


There has been significant publication on loanwords from indigenous languages 
into Australian (and in many cases World) English (e.g. Dixon, Ramson, & 
Thomas 1990), like kangaroo from a word for a type of kangaroo in Guugu
Yimidhirr in Far North Queensland, recorded by Cook on his expedition to 
Australia in 1770. 

Loanwords from English and pidgin into indigenous languages have received 
less attention. In Gurindji for instance, around 10 percent of loanwords and less 
than 5 percent of total vocabulary are from English or pidgin (McConvell, to appear 
b). New terms for new things were also coined (tiwu-waji ‘flying thing’ for aero-
plane), existing vocabulary extended (wajirrki ‘dragonfly’ for helicopter; warrayal
‘sand’ for sugar, cf. Walsh 1993: 121 for the same equation in Murrinh Patha), or 
words were borrowed from other indigenous languages (nalija ‘tea’ from Mudburra 
originally meaning ‘waterweed, algae’). Otherwise English or pidgin words are 
used, not as true loanwords but as code-switching insertions (see below, section 4.2), 
rather than being fully integrated into the language phonologically. 

Indigenous loanwords traveled long distances from their origin, borrowed 
along chains of languages or brought by individual Aboriginal people moving as 
part of the new regime and new industries such as cattle, fishing, and pearling 
(McConvell & Thieberger 2005). Yawarta ‘type of kangaroo’ in southwestern 
Australia moved north as a word for ‘horse’ and is widely used in the 
Kimberleys and neighboring parts of the Northern Territory (for other long- 
distance borrowing of words for ‘horse’ see Walsh 1992). 

On the north coast there was significant borrowing of Austronesian words used 
by the sailors who visited the area over the last 300 years or more searching for
marine products, mainly trepang or sea-slug. The loanwords number hundreds 
of items in some languages (Walker & Zorc 1981; Evans 1997). Among them are 
many for introduced cultural items such as rrupiya ‘money’ and lipalipa ‘(type of) 
canoe’, as well as ceremonial vocabulary due to incorporation of elements with 
Macassan sources in ritual. 


4.2 Domains and code-switching 


Jernudd (1971) in his sociolinguistic account of some Northern Territory situ- 
ations regarded Aboriginal traditional languages and English as being in a diglossic 
situation, with English used in most public contexts and the traditional languages 
in home and family situations mainly. However for the English-based creoles and 
codes involving mixtures of English and traditional languages, on the one hand, 
and the traditional indigenous languages, on the other, he could see no diglossia 
between them. They were, he thought, used in the same range of situations because 
of rapid social change and lack of clarity in the community about domains 
(Jernudd 1971: 17, 21). He also identifies the limitations of the domains approach
and the advantages of the “metaphorical” approach to code-switching, citing Blom 
and Gumperz (1972), which was a precursor to the “social meaning” approach 
where functions of code choice are not tied to “domain.” 

Code-switching between languages has been reported for a number of places 
in Australia, both intersentential and intrasentential (insertional), e.g. Bani (1976). 
While focus has been on code-switching between a traditional indigenous language 
and an English-based variety, there is evidence also of it occurring between 
traditional languages and we might infer that this is a longstanding practice not 
a recent innovation due to colonization. This runs somewhat counter to the view 
of Heath, cited above, who doubted that it was much practised traditionally in 
the area of Arnhem Land he studied. 

Elwell (1982) however reports code-switching as a feature of the multilingualism 
of the Maningrida settlement not far from Heath’s study area where several 
languages were and still are spoken belonging to distantly related subgroups and 
families. The settlement, which brought together the language groups to live together 
more or less permanently, is a result of colonization, but multilingual camps and 
groupings would have been a feature of pre-colonization traditional life. 

Code-switching was pervasive in nearly all conversations among the Gurindji 
in the 1970s and 1980s (McConvell 1988). Different languages (local western dialects
of Gurindji, a more standard eastern variety, Kriol, and English) are mixed in a
single discourse where the situation, interlocutors, location, and topic are the same 
so “domain” analysis cannot explain language choice. Nevertheless the choices 
have pragmatic function. Using a particular language in many cases adds a
“social meaning” about the “social arena” in which the literal meaning of the utter- 
ance was to be construed — in this case the local dialect group, the community, 
the wider group of Aboriginal people in the “cattle belt” region, and Australian 
society in general (McConvell 1988: 110). Using a language like a shared local 
dialect calls up a set of rights and responsibilities associated with the speaker’s 
and other participants’ position in the social arena (Myers-Scotton 1993) — for
instance implying that people who belong to the same dialect group should share 
resources (Elwell 1982: 97 also reports this kind of function at Maningrida). 
A statement or question in such a dialect can have the force of a request. 


4.3 Contact studied as “language death”:
Young People’s Dyirbal 


Many of the situations in which language contact can be studied in Australia are 
those in which the traditional indigenous languages are severely endangered. Some 
of the studies therefore couch their approach in terms of a “language death” or 
“language obsolescence” framework, following Dorian (1981; 1989) and others. 
This perspective may lead researchers to explain phenomena in terms of impend- 
ing death or obsolescence rather than language contact and change, but studies 
do present important data on how languages are affected structurally by obso- 
lescence (e.g. Austin 1986; McGregor 2002). 

The best-known study of the sociolinguistics of a severely endangered language 
in Australia is Annette Schmidt’s (1985) study of Young People’s Dyirbal (YPD), 
spoken in the North Queensland region around Cardwell and Tully in the 1970s 
and 1980s. Although a local form of Aboriginal English has been used for many 
years the changes in YPD are not studied in terms of contact with English, and 
indeed some do not appear to be related to English or pidgin influence directly. 

Much of the structure and content of YPD is nevertheless clearly related to local
Aboriginal English, although grammatical elements, and not simply vocabulary
items, are retained from traditional Dyirbal. These include Dyirbal case-marking
(albeit in a simplified form), including the ergative case.

Schmidt shows that the changes are not uniform across the young people in 
the community but vary in different “gangs.” The two that she studied were the 
Buckaroos, who favor a “cowboy” style of dress, music, and behavior, related to
those Aboriginal people in the region who have been in the cattle industry, and 
the Rock’n’Rollers who, as the name implies, favor rock and roll music and have 
more attachment to some traditional styles and beliefs of the Aboriginal people 
of the region, including the Dyirbal language. These stylistic preferences are also 
reflected in the lects which they adopt when speaking their version of “Dyirbal.” 

The use of the ergative case-marker is very high (around 90 percent) among the
Rock’n’Rollers whereas it is very low (around 10 percent) among the Buckaroos. 
On the other hand, the past tense marker bin from nearby varieties of Aboriginal 
English (also found in Kriol) is entirely absent from Rock’n’Rollers’ speech but 
present in about 50 percent of the Buckaroos’ past tense clauses. 


4.4 Radical change and hybridization: Modern Tiwi 
and Gurindji Kriol 


While Young People’s Dyirbal, dealt with above, is arguably a case of radical change 
brought about by contact with an Aboriginal variety of English, it is not analyzed 
in that way in the source. Other situations are more obviously cases of hybridiza-
tion where there is mixing not only of the vocabulary of two languages in con- 
tact but also of their grammars (McConvell 2002; 2008a; 2008b). 

The language called “Tiwi” is still spoken by children and young people on 
Melville and Bathurst Islands offshore from Darwin in the Northern Territory, 
even though it has undergone radical change away from its original form. The 
traditional form of the language had extremely complex verb forms (of a 
“polysynthetic” type). It was spoken with little change until around the 1950s when 
the language of the younger people began to display major differences from that 
of the older generations, including simplification of verb forms and many more 
words imported from English and pidgin. This dramatic change coincided largely 
with the full impact of the Catholic mission regime, along with the housing of 
children in dormitories. In other cases such isolation from parents and their lan- 
guage in institutions has been seen as responsible for shift to Kriol and English, 
as at Roper River. The children, however, did not suffer complete removal from 
interaction with Tiwi speakers. A possibly relevant factor is the pervasiveness of 
code-switching between Tiwi and pidgin or English both today and at the time 
when Modern Tiwi, as Lee (1987) calls the new variety, arose. 

The example below gives an idea of some of the kinds of changes in Modern 
Tiwi; the prepositional phrase with prepositions from English/pidgin is bolded: 


(6) a. Traditional Tiwi 
ngu-mpu-nginji-kuruwala 
I-NPST-you.DAT-sing
‘I will sing for you.’

b. Modern Tiwi 
yi-kirimi jurra fu ngawa 
he-PAST-make church for us 
‘He made a church for us.’ 


Traditional Tiwi verbs are complex, of the polysynthetic type. A verb stem occurs
at the end of the verb word, and is preceded by prefixes which represent the time 
of the event and other elements of the sentence including incorporated nouns. 
Modern Tiwi verbs keep the subject, tense, and aspect prefixes but not the prefixes 
for objects and other elements; so in (6b) ‘for us’ is expressed as a prepositional 
phrase, contrasting with Traditional Tiwi (6a) where ‘for you’ is expressed in the 
verb (nginji ‘you’). 

Further south in the Northern Territory, studies have recently been carried out 
of two mixed languages originating in contact and bilingualism involving a tra- 
ditional indigenous Pama-Nyungan language of the Northern Territory, Warlpiri 
and Gurindji respectively, and Kriol (O’Shannessy 2005; 2006; McConvell & 
Meakins 2005; Meakins 2007). Below, Gurindji Kriol is described: the situation with 
Light Warlpiri is quite similar in many relevant respects. 

Gurindji was mentioned above as an example of a community which exhibited 
pervasive code-switching in the 1970s and this is a key element in the transition 
to the mixed language. In this code-switching, case-marking of NPs and other mainly
nominal morphology was retained in Gurindji. Lexical (content) items were drawn 
from both languages, including many of the Gurindji coverbs (uninflecting ele- 
ments usually occurring with inflecting verbs in Gurindji) taking the role of verbs 
in the new mixed language, Gurindji Kriol. 

This code-switching style, and particularly this default pattern of split between 
elements from Gurindji and Kriol, was stabilized into a mixed language, Gurindji 
Kriol, as children acquired it as their first language in the 1960s to 1980s. This 
was described by Gurindji speakers working with McConvell in Dalton et al.
(1995, termed “Gurindji Children’s language”), Charola (2002), and Meakins (e.g.
Meakins 2007, to whom I am indebted for some of the data in this chapter).

The patterns of Gurindji Kriol mixed language are due to the most frequent 
and salient input to child learners from adults in the 1960s to 1980s being 
Gurindji-Kriol code-switching, combined with declining proficiency in tradi- 
tional Gurindji among most young people. In Gurindji Kriol, traditional Gurindji 
pronominal enclitic marking and inflecting verbs with their inflections were lost. 
This provides a counter-example to Bakker’s generalization (2003: 129) that mixed 
languages do not arise from code-switching. 

Gurindji Kriol exhibits a split between what might be termed verbal and nom- 
inal systems, as do other mixed languages like Michif (Bakker 1997; McConvell 
2008b). 

The hybridization in Modern Tiwi is closely parallel to that in Michif. 
However, the source language for each component in Gurindji Kriol is the 
reverse of Michif (where the old language, Cree, is the source of the verbal 
system and the new language, French, the source of nominal systems). The fol- 
lowing two sentences from a Gurindji Kriol story illustrate some of these features 
(McConvell & Meakins 2005). Gurindji elements are in italics; other elements are 
drawn from Kriol. 


(7) Gurindji Kriol, recorded in 2002 
nyawa-ma wan karu bin plei-bat pak-ta nyanuny
this-TOP one child PAST play-CONT park-LOC 3SG.DAT
warlaku-yawung-ma. 
dog-having-TOP 
‘This one kid was playing at the park with his dog.’ 


kamon warlaku partaj ngayiny leg-ta...
come.on dog go.up 1SG.DAT leg-LOC
‘Come on dog jump up on my leg...’ 


In this excerpt, most of the verb phrase morphology is derived from Kriol — past 
tense “bin,” continuative “-bat.” Elements from Gurindji are emphatic and pos- 
sessive pronouns nyanuny ‘his’ and ngayiny ‘my’, locative marker -ta, proprietive 
-yawung and demonstrative nyawa ‘this’. Both languages contribute content
words — Gurindji: warlaku, karu, partaj; and Kriol: “plei,” “leg.” Partaj ‘go up’ is a 
coverb in traditional Gurindji which usually occurs with a verb e.g. partaj yanta 
‘go up go’, but in Gurindji Kriol can occur alone as a verb. 


4.5 Koineization: Dhuwaya 


Dhuwaya, among the Yolngu people of Yirrkala, is a case in which a “baby talk” or
“motherese” variety has become the language of the young people (Amery 1993).
At the stage that Rob Amery studied it (Amery 1985) this was the situation; now, 
over 20 years later, middle-aged people also speak it much of the time. All the 
youth of the Yirrkala community now speak this same variety, whereas the tra- 
ditional situation was for several “patrilects” to be spoken — each dialect being 
distinctive of a clan. Dhuwaya represents a koine — a merger of the motherese 
varieties of all the Yirrkala dialects. 

The new language Dhuwaya as well as the patrilects were still spoken by the 
community, including the young children, at least at the time of Amery’s 
research. Amery (1985: 135) describes the situation as “diglossia turned on its head” 
because in Yirrkala it is the “local dialects,” the patrilects, which are the H (High) 
varieties, and the language which straddles all the groups (Dhuwaya) which is 
the L (Low) variety. He illustrates this in terms of the domains in which each of 
these is typically used: patrilects are used in meetings and in church, for public 
announcements, talking to distant kin, and, perhaps incongruously, drunken 
talk. Talk at home, among peer groups and close kin, and in sports contexts, on 
the other hand, is in Dhuwaya. 

The actual differences between Dhuwaya and the patrilects are fairly minor, 
consisting of changes of a few consonants to weaker sounds (as in the change 
of “l” to “y” in the language name Dhuwaya compared to Dhuwala, meaning 
‘this’). To judge by Amery’s examples, there is little influence of English or 
importation of English vocabulary into Dhuwaya, although code-switching into 
English is rife in spoken Yolngu of most varieties. 


4.6 Language maintenance with minor change: 
teenagers’ Pitjantjatjara 


The chain of dialects known as the Western Desert language remains a “strong 
language” in the sense that it is still learned by children in a number of areas. 
Langlois’ (2004) study shows that one of these dialects, Pitjantjatjara, was used 
by young people at Areyonga in 1994-5 as their major mode of communication 
and is a vibrant language open to some innovation and change. Such changes as 
there are, are relatively minor and are not seen as a symptom of language shift 
or language death. Some of the changes are not related to English “interference,” 
while others may be, but are of a kind also commonly found in different con- 
texts. For instance, in traditional Pitjantjatjara, as in many Australian languages, 
the way to express ownership of a body part “my stomach” is to juxtapose the 
owner and the body part: “me stomach,” while the way to express ownership of 
other things is to use the dative suffix -ku on the owner “me -ku book” = ‘my 
book’. But young people are now extending the suffix -ku for ownership of body 
parts. In teenage Pitjantjatjara (8a) is found in contrast to the traditional (8b): 


(8) a. ngayu-ku tjuni pika 
my stomach sick 
‘I have a sore stomach’, as opposed to the traditional: 
b. ngayulu tjuni pika 
me stomach sick 


However, there are examples of just this kind of change in many languages around 
the world that are not in contact with English, and it seems likely that the 
alienable/inalienable distinction is fragile and easily lost (McConvell 2005; 
Nichols 1988). 
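
The contrast in (8) can be restated as a simple decision rule. The following Python sketch is illustrative only: the function name `possessive_phrase`, the variety labels, and the small word lists are assumptions made for the example. It uses only the forms cited in (8), with "book" standing in as an English placeholder, as in the gloss “me -ku book” above.

```python
# Illustrative sketch only: forms are limited to those cited in example (8).
PRONOUN_FORMS = {
    "1SG": {"plain": "ngayulu", "dative": "ngayu-ku"},
}

# Nouns treated as body parts (inalienably possessed); 'tjuni' is 'stomach'.
BODY_PARTS = {"tjuni"}


def possessive_phrase(owner: str, noun: str, variety: str) -> str:
    """Return owner + possessed noun for the 'traditional' or 'teenage' variety."""
    if variety not in ("traditional", "teenage"):
        raise ValueError(f"unknown variety: {variety}")
    forms = PRONOUN_FORMS[owner]
    if variety == "traditional" and noun in BODY_PARTS:
        # Traditional pattern: plain juxtaposition of owner and body part.
        return f"{forms['plain']} {noun}"
    # Dative -ku: other possessions in both varieties, and body parts
    # as well in the teenage variety.
    return f"{forms['dative']} {noun}"


print(possessive_phrase("1SG", "tjuni", "traditional"))
print(possessive_phrase("1SG", "tjuni", "teenage"))
```

The only point at which the two varieties differ in this sketch is the body-part branch, which mirrors the loss of the alienable/inalienable distinction described above.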


5 Conclusions 


This chapter has reviewed the research and debates surrounding the significance 
of language contact in the history and development of Australian languages, in 
the long term before colonization, and in the recent changes that have followed 
on colonization as the influence of English varieties has become stronger 
and many of the traditional indigenous languages have become endangered or 
been lost. 

There have been significant studies showing the importance of language con- 
tact and linguistic diffusion in indigenous Australia, reflecting the small size of 
language groups and the predominance of multilingualism. It is not clear how- 
ever that these characteristics mean that standard comparative linguistics does 
not work in Australia. While language contact diffusion is significant, care needs 
to be taken not to overindulge in structural diffusion scenarios and always to give 
the alternatives, inheritance and independent invention, adequate consideration. 
Levels and types of borrowing and contact influence varied between languages 
and it is important to try to understand the reasons for this, probing the socio- 
linguistics of traditional societies and the incidence of such phenomena as code- 
switching, pidginization/creolization, language hybridization, and language shift. 
Here the more accessible information about recent phenomena of this kind, while 
occurring in the different context of European colonization, may provide clues to 
earlier scenarios. When the linguistic contact phenomena in Australia are better 
understood this knowledge will throw light on the prehistory of Australia and 
on the language dynamics of hunter-gatherers more generally. 


1 Introduction 


For the purposes of this chapter, the geographical region of New Guinea will be 
defined as that area of the southwest Pacific, excluding Australia, in which lan- 
guages not belonging to the Austronesian language family can be found, roughly 
from the easterly Indonesian islands of Halmahera, Timor, and Alor in the west 
(125° E) to the westerly island group of New Georgia in the Solomon Islands in 
the east (155° E), with a land area of approximately 850,000 square kilometers or 
approximately the size of the Australian state of New South Wales. Within this 
area are crammed some 1,200 languages or about 20-25 percent of the world’s 
total, of a bewildering multiplicity of language families, groupings, and isolates 
— a linguistic diversity unparalleled elsewhere on the globe. Some quarter or so 
of these languages do belong to the Austronesian language family, but the rest 
do not, but rather fall into (or fail to do so) a number of distinct language 
groupings, called non-Austronesian to highlight their strictly negative lumping 
together — that is, these are languages of the region which are not Austronesian, 
or more commonly Papuan, again with no genetic commonality presupposed 
by this label. Although diversity is the hallmark of the New Guinea region, far- 
reaching processes of language convergence have been operative for millennia. 
Examples of diffusion of all aspects of language structure, from lexical items to 
bound morphemes, can be identified, but some do seem more resistant than others, 
with bound morphology standing out in this regard. Because of the high struc- 
tural similarity of many non-Austronesian languages across diverse families, 
itself probably the result of convergence over millennia, and our poor knowledge 
of their prehistory, specific cases of borrowing can sometimes be very hard to 
identify. Austronesian languages are on the whole quite different, and further, 
have many congeners outside the region, allowing us to ascertain with some 
confidence Austronesian forms and structural traits. Consequently, much of 
what is known of language convergence in the region concerns contact between 
Austronesian and Papuan languages, although in this paper, I will broaden this 
base to discuss some cases of language contact between Papuan languages. 


2 Multilingualism and Bilingualism 


In many parts of the New Guinea region, multilingualism is a common feature. 
While today this typically means competence in a national lingua franca like Tok 
Pisin in Papua New Guinea or Indonesian in Papua, in earlier times multilingualism 
entailed speaking one’s vernacular or village language in addition to one or more, 
typically adjoining, regional languages, which due to the great diversity may or 
may not have been genetically related to the vernacular. The extent of traditional 
multilingualism in New Guinea often depended on overall community size. 
Small or very small language groups are much more likely to exhibit extensive 
multilingualism. The Sepik language Yelogu is spoken in a small village of 
around 63 inhabitants (Laycock 1965). All speakers are bilingual in their own and 
the distantly related language Kwoma, which has several thousand speakers. Also 
in the Sepik area, Karawa is spoken in a village of the same name by around 
60 speakers (Ferree 2000). Karawa villagers have long been bilingual in the 
neighboring closely related language Bouye, with about a thousand speakers; they 
are in fact now giving up their vernacular in favor of that of their numerically 
superior close relative. Such language shifting as a result of bilingualism was 
undoubtedly very common in the New Guinea region in past times. In the high- 
lands, a similar situation obtains with Binumarien (Oatridge & Oatridge 1973). 
Now only a little over a hundred strong, this group was once much larger, but 
its population has been reduced due to tribal fighting and prolonged residence 
in the malarious Markham valley. Many of the men among the Binumarien are 
competent in one or more of the three adjoining languages — the related Tairora 
and Gadsup and the unrelated Austronesian language Adzera; the prognosis for 
this language is doubtful. 

With larger language groupings, bilingualism may be more selective, restricted 
to border communities or to individuals, generally men, with extensive outside 
contacts through trade or exogamous marriage contracts. In the latter case, multi- 
lingualism may be viewed as an index of status, correlated as it is with higher 
economic positions in the community. The restriction of multilingualism to border 
villages was noted by Berndt (1954) with regard to the Eastern Highlands of Papua 
New Guinea. Usarufa, a small language of around a thousand speakers, is sur- 
rounded by the much larger and distantly related languages Fore and Kamano- 
Kanite, each with over ten thousand speakers. Bee (1965) claims that most adult 
Usarufa can speak Fore and Kamano-Kanite, but few Fore and Kamano-Kanite 
speakers can speak Usarufa, and these are restricted to border communities. 

A rich case study of multilingualism within a larger language group is provided 
by Salisbury’s (1962) study of the highlands people, the Siane, numbering then 
some fifteen thousand. Siane speakers in the area of study are bilingual in the 
(at best) very distantly related neighboring language Chuave, the population of 
which numbers eight thousand. The village Emenyo, in which Salisbury carried 
out the study, is close to the border between the two languages, and bilingualism 
within the village is extensive, involving both men and women. Many Chuave 
women have married into Emenyo, and close trading links are kept up with the 
Chuave-speaking villages across the border. Within the Siane context there is a 
distinct prestige associated with multilingualism, which is regarded as a desir- 
able accomplishment. There are many indications of this in the culture: the use 
of foreign languages in Siane songs and the ubiquity of translations from one 
language into the other on formal occasions as a means of showing off language 
proficiency, even if such translations are unwarranted for reasons of comprehension. 
At least one very important Siane man spoke Chuave on almost all occasions within 
his own Siane-speaking village. He presumably regarded this marked linguistic 
behavior as consistent with his high social standing. 

With the exception of multilingualism in women acquired from other linguistic 
groups through exogamous marriage patterns, as in the Siane example above, 
multilingualism was generally a male affair in precontact New Guinea, and it 
remains so in more isolated communities today. The nature 
of women’s roles in New Guinea societies is such that they tend to have few sus- 
tained contacts outside the village and hence encounter few situations which would 
encourage multilingualism. Conrad (1978) states that women among the May River 
Iwam of the upper Sepik River region do not speak to outsiders, thus obviating 
any need for multilingual proficiency. Litteral (1978) writes of the Anggor, also 
from the upper Sepik, but linguistically unrelated to the Iwam, that multilingualism 
is extremely limited or nonexistent among women, except those of Amanab birth 
who have married into Anggor-speaking villages. As of Litteral’s (1978) writing, 
this attitude toward women and linguistic skills was pervasive in Anggor society, 
so that girls did not actively learn Tok Pisin in contrast to boys, and older males 
attempted to restrict the access of married women to literacy classes. The rapid 
onset of language endangerment of the vernacular in the past three decades has 
undoubtedly caused a major change in these attitudes. 

Multilingualism is, of course, the most intimate way for two languages to come 
into contact: in the mind of a single individual. Not surprisingly, multilingualism 
on such an extensive scale as obtains in the New Guinea region has yielded a 
language contact zone of an equally wide spread. It is no exaggeration to say that 
the entire New Guinea region is one enormous language contact area, its borders 
defined by the peripheries at which Austronesian languages abut Papuan languages. 
Within this massive zone, smaller areas of intensive contact can be discerned, 
such as the Sepik-Ramu basin, New Britain, the Bird’s Head Peninsula, etc. As in 
Weinreich’s (1953) classic study, the effects of this intense language contact over 
many millennia have been profound; the languages show borrowing and diffu- 
sion of traits at all levels: lexical items, phonological patterns, bound morphology, 
word order, syntactic constructions, discourse styles, genres, etc. In extreme cases 
this has led to controversy over the genetic affiliation of a language. Strong (1911) 
claimed that Maisin was an Austronesian language, while Ray (1911) countered 
that it was a Papuan language, a position also advanced by Capell (1976). 
(A similar controversy now attends Warembori, a member along with Yoke of 
a small language grouping at the mouth of the Mamberamo River in Papua. 
Donohue (1999) on balance believes it to be a Papuan language with very heavy 
Austronesian influence; Malcolm Ross (personal communication) on the other 
hand contends that it is an Austronesian language with a heavy Papuan overlay. 
My own view is that the current evidence favors the latter classification, but much 
more research on this seriously endangered language is clearly called for.) For 
Maisin, Ross (1996) presents a strong and ultimately convincing case that it is indeed 
an Austronesian language, but one which the processes of language convergence 
due to multilingualism and language contact have forced to restructure radically, 
in both grammar and lexicon, in the direction of the Papuan languages, 
of diverse language families, which surround it on all sides. Ross (1996) terms 
this restructuring process metatypy. But the important point here is: However 
obscure the Maisin case may seem, metatypy between Austronesian languages 
and the typical Papuan language is relatively easy to identify because of the wide 
lexical and structuring divergences between these and the fact that the basic 
Austronesian grammar and lexicon can be established outside the New Guinea 
region, beyond the possible effects of metatypy from Papuan languages. This makes 
it easier to identify cases within the region where metatypy has occurred in 
Austronesian languages due to Papuan influence. Between adjoining Papuan lan- 
guages the task of recognizing metatypy is much more difficult in most cases: 
its effects over millennia have already rendered adjoining Papuan languages of 
diverse genetic groups often typologically broadly similar. Most interesting are 
cases of borrowing of core vocabulary or bound morphemes; a number of these 
will be discussed here. 


3 Borrowing 


Undoubtedly, the most common effect of language contact situations and the most 
easily diffused features of language are lexical items or words. This is clearly a 
result of their high degree of metalinguistic awareness (Silverstein 1981) in both 
their referential meaning and their segmentability. Laypeople’s understanding 
of language in most cultures concerns words; even in New Guinea speakers 
consider their language to be documented when a dictionary appears; grammars 
simply don’t cut it. Further, nouns tend to outrank verbs in their degree of meta- 
linguistic awareness for speakers; particularly, concrete nouns are more trans- 
parently referential in their meaning (i.e. when used as the head of an NP, they 
refer, in the classic sense), and given their typically simpler morphology in most 
New Guinea languages (often none at all), noun roots are more segmentable than 
verbal roots, which commonly are buried in layers of often opaque morphology. 
Not unexpectedly, nouns are borrowed with higher frequency in language con- 
tact situations in New Guinea. An interesting case study concerns the long-term 
contact and linguistic borrowing on Umboi Island between the Austronesian 
language Mangap-Mbula and the Papuan language Kovai of the Finisterre-Huon 
family (Bugenhagen 1994). Lexical borrowings have gone in both directions. The 
words borrowed from Kovai into Mangap-Mbula fall largely into three classes: 
flora and fauna terms for species endemic to the island: polop ‘bush rat’, marai 
‘ginger’, yguloy ‘lizard species’; land technology items like plam ‘bow string’, narabu 
‘bread’, kuy ‘mortar’, ktimbi ‘housepost’, aro ‘digging stick’; and trading words like 
psene ‘bundle’, yo ‘collect’, ropon ‘count’, aikupa ‘pig without a tail’. In this last class, 
trading contexts, a couple of verb roots have been borrowed; other verb roots bor- 
rowed include: wilala ‘blow’, ut ‘adopt’, tok ‘shake’, kor ‘sweep’, yguru ‘wash’. Words 
borrowed from Mangap-Mbula and other closely related Austronesian languages 
of the area into Kovai also include technology: napagas ‘axe’, pe(l)pel ‘basket’, kulambu 
‘pot’, pon ‘gun’; and trade items: bu ‘betelnut’, ge ‘pig’, to ‘sugarcane’, goun ‘dog’, 
urat ‘work’. But, surprisingly, quite a lot of basic vocabulary has been borrowed 
into Kovai from neighboring Austronesian languages: sosou ‘beach’, suyon 
‘breast’, oz ‘day’, monon ‘fat, grease’, longon ‘inside’, aling ‘language’, laun ‘leaf’, 
egon ‘leg’, atnon ‘liver’, nan ‘mother’, abal ‘mountain’, aun ‘mouth’, bong ‘night’, 
kut ‘louse’, ampiti ‘star’, zongen ‘tooth’. 

Much of the methodology of comparative linguistics depends on an assump- 
tion of the resistance of basic vocabulary items to borrowing and hence their reli- 
ability for the establishment of regular sound correspondences. While they may 
or may not have some resistance to borrowing, clearly in the New Guinea region 
basic vocabulary items are certainly not immune. In addition to Kovai consider 
Watam, a language of the Lower Ramu family. It too exhibits borrowing from 
the neighboring Austronesian languages of the Schouten Islands in basic vocabu- 
lary: namot ‘man’, wain ‘woman’, kiau ‘dog’, jim ‘cloud’, was ‘wind’. From the 
adjoining Lower Sepik language Kopar, Watam has borrowed giramot ‘three’ and, 
perhaps most surprising of all, arum ‘water’. 

The borrowing of bound morphemes is much less common than that of lexical 
items, again clearly for reasons of metalinguistic awareness. Being bound 
morphemes rather than free forms and commonly having grammatical meanings rather 
than clear reference, they score much lower in terms of Silverstein’s (1981) 
metrics. Still, there are a few plausible cases of this attested in the New Guinea region. 
Iatmul has a future tense suffix -kia which has probably been borrowed into a 
number of adjoining closely related languages, but, most strikingly, into geo- 
graphically distant and genetically unrelated Yimas, where it marks not only near 
future tense but, due to the peculiarities of the Yimas day reckoning system, also 
nighttime (see Foley 1991). Reesink (1998) notes a verbalizer bi- in the West Papuan 
language Abun likely derived from the nearby Austronesian language Biak, 
and Dol (1999) describes an instrumental nominalizer po- in Maybrat, again very 
likely from Proto-Austronesian *paN-. In general, as these examples indicate, the 
borrowing of bound morphemes is sporadic rather than systematic, as might 
be surmised from their low degree of metalinguistic awareness, and these few 
examples do support the oft-repeated claim that bound morphemes are relatively 
resistant to borrowing. There is, however, at least one likely case of more exten- 
sive and systematic grammatical borrowing. In the lower Mamberamo River area, 
the bound subject pronominal prefixes are formally identical across two neigh- 
boring, but unrelated, languages, Warembori and Kauwera of the Kwerba family 
of languages (Donohue 1999). They are in addition strikingly similar to those of 
the Austronesian languages of Cenderawasih Bay, suggesting diffusion of bound 
morphemes across three unrelated language groups, or two if Warembori 
ultimately turns out to be an Austronesian language. 


4 Metatypy 


Metatypy, or the restructuring of one language on the basis of another, can affect 
all parts of the linguistic system: lexicon, phonology, grammar, and discourse. 
Lexical metatypy, sometimes better known as calquing or loan translation, occurs 
when the components, semantic or formal, which constitute a lexical item in one 
language are translated bit by bit into another language using its own native 
resources. Ross (1996) gives a number of examples of lexical metatypy between 
Austronesian Takia and Papuan Waskia, both spoken on Karkar Island: 


(1) a. Takia:  bani-g ate-n              Waskia: a-gitiy gomay 
               hand-1SG liver-3SG                1SG-hand liver 
               ‘palm of my hand’ 

    b. Takia:  mala-g i-kilani           Waskia: motam gerago-so 
               eye-my 3SG-go.round               eye go.round-3SG.PRES 
               ‘I’m dizzy.’ 

    c. Takia:  awa-n yu-tale             Waskia: kuriy batugar-so 
               mouth-his 1SG-cut                 mouth cut-1SG.PRES 
               ‘I disobey him.’ 

    d. Takia:  tamol-pein                Waskia: kadi-(i)met 
               man-woman                         man-woman 
               ‘person’ 


Because phonological metatypy is rather less commonly discussed, I will provide 
a more extensive exemplification of this in three distinct cases. Watam has a 
single dorsal voiceless stop /k/ in phonemic contrast with the other voiceless stops 
and with a voiced stop /g/ and a voiced prenasalized stop /ng/. This voiceless 
stop has three allophones, distributed according to the following rule: 


(2) /k/ → [q] / __ V[-front, -high] 
          [ʔ] / V[-front, -high] __ 
          [k] / elsewhere 


The [ʔ] allophone is a farther back realization of the [q] allophone in postvocalic 
position. The [q] allophone is also possible here, but is now uncommon; a sound 
shift of [q] > [ʔ] is now well underway. Manam, an Austronesian language 
spoken on the island of the same name and whose villages are long-term 
trade partners of the Watam, inherited from Proto-Austronesian a phoneme /q/ 
which is in the process of shifting to /ʔ/ (Lichtenberk 1983). At the time of 
Lichtenberk’s fieldwork in the mid 1970s, speakers aged 15-50 used /ʔ/, while 
older speakers used /q/, and the variation was so salient to Manam speakers that 
they had names for speakers of both variants. It seems quite likely that the shift 
of the [q] allophone of /k/ to [ʔ] in Watam has actually metatypically diffused 
from Manam, although here it only affects the allophonic realizations of the phoneme 
/k/ and only in postvocalic position, rather than the phonemic inventory itself 
as in Manam. 

A more extensive case of phonological metatypy is exemplified in the upper 
Karawari River region, across languages of no less than four families: the Lower 
Sepik language Yimas, the Sepik language Alamblak (Bruce 1975; 1984), the 
isolate Arafundi, and the Engan language Enga. Both Yimas and Alamblak have 
productive phonological rules which create palatal consonants from apical ones 
in the environment of high front vowels or glides. The derived status of the palatals 
is indicated by the fact that the large majority of occurrences of these sounds is 
at morpheme boundaries, and, at least in Yimas, by phonotactic constraints: of 
the three palatal sounds, only /1/, the palatal lateral, can occur word-finally; 
the other two, /c/ and /n/, are found exclusively in onset positions or in the 
cluster /pc/. Some examples follow: 


(3) a. Yimas:    tay- ‘see’ + -nak IMP > tapak ‘look!’ 
                 awykwi- ‘sink’ + tpay- ‘bathe’ > awykwepay ‘bathe in river’ 
                 awi ‘axe’ + -ntmpt PL > awpcmpt ‘axes’ 
    b. Alamblak: yawi- ‘dog’ + -t FEM > yawf ‘female dog’ 
                 bari ‘hornbill’ + -t FEM > barf ‘hornbill’ 
                 hay- ‘ironwood’ + doh- ‘canoe’ > hadoh- ‘ironwood canoe’ 


The high productivity of the palatalization rule in these two languages and these 
sounds’ phonotactic restrictions suggests that they are relatively recent develop- 
ments, probably more so in Yimas than in Alamblak, as there exist a few mini- 
mal pairs in the latter language (Bruce 1975): fuh- ‘fall’ versus sui- ‘grass skirt’, 
yuyg- ‘sound’ versus nuygwa- ‘bird’, dzing- ‘basket for insects’ versus dif- ‘white 
soil’. The historically derived status of palatals in Yimas is further indicated by 
the fact that its closest relative, Karawari, has apical stops or fricatives, often adjoin- 
ing a high front vowel, corresponding to palatals in Yimas (remember /1/ in Yimas 
is a palatal lateral): 


(4)              Karawari    Yimas 
    ‘urine’      snd         net 
    ‘get up’     kwas-       kwalca- 
    ‘put.down’   wus-        wul- 
    ‘feces’      mnti        mlm 
    ‘verandah’   pariapa     palapa 
    ‘man’        panmari     panmal 


In the other two languages, Arafundi and Enga, palatal consonants are deeply 
entrenched phonemes, a clear indication of their antiquity. In Arafundi, they are 
not the result of a productive phonological rule, but are base phonemes. They are 
not more common at morpheme boundaries than stem-internally and can occur 
anywhere in the word, including word-finally: kac ‘sago grub’, ec ‘tree’, andimun 
‘afternoon’, kamn ‘two’. They are also found in independent pronouns and in 
verbal subject agreement suffixes, again clear evidence of their underived status: 
ac ‘1DL’, nin ‘2DL’, -pen 3SG.PROG, -71 2/3DL.PERF. There are very good grounds 
to conclude that palatalization has diffused via language contact from Arafundi 
and/or Enga into Alamblak and Yimas. As all of these communities have exten- 
sive trading contacts and some intermarriage, albeit marginal in these endogam- 
ous cultures, this is not too surprising. What we now see in the upper Karawari 
River area is something of a “palatalization zone,” not unlike what is found among 
the Slavic languages, but here across four unrelated languages. 

For a final, quite striking example of phonological metatypy, consider the case 
of Madak, an Austronesian language of central New Ireland (Ross 1994). All 
Austronesian languages are intrusive in New Ireland, having arrived in the last 
4,000 years or so into an area already occupied by speakers of Papuan languages 
for some 40,000 years. Austronesian languages of the region tend to have little 
phonetic variation in the realization of their voiceless stops, typically realizing 
them as fortis, unaspirated. Papuan languages, on the other hand, tend to exhibit 
very extensive processes of phonetic lenition of stops between vowels. Hence, typically: 


(5) p > f 


Madak is spoken in a territory adjoining Kuot, the sole surviving Papuan lan- 
guage of New Ireland, and it exhibits extensive patterns of lenition, paralleling 
those of Kuot (Ross 1994): 


(6) a. Madak       pas ‘taro’          tat ‘hand-basket’   kalyi ‘head-basket’ 
       la- = ‘the’      ↓                   ↓                   ↓ 
                   la-vas              la-rat              la-yalyi 
    b. Kuot        naip ‘skin’         kit ‘fire’          kakok ‘snake’ 
                        ↓                   ↓                   ↓ 
                   naiv-a ‘his skin’   kir-ip ‘fires’      kayoy-up ‘snakes’ 


A likely explanation for this anomalous sound process in an Austronesian lan- 
guage is that present-day Madak speakers were originally Kuot (or a closely related 
language) speakers who shifted their language from Kuot to the more prestigious 
invading Austronesian language (remember all currently spoken languages on 
New Ireland, save Kuot, are Austronesian). But as they did so, they transferred 
the phonological rules of their original Papuan language into the Austronesian 
language being adopted. Phonetic habits like lenition are deeply entrenched and 
lie well below the threshold of metalinguistic awareness. Unlike the previous two 
cases of phonological metatypy which could be argued to be effects of conscious 
adoption of the phonological patterns of languages spoken by valued trading 
partners (e.g. replace [q] with [ʔ]), much as a snobbish American might affect an 
RP-style pronunciation of English, the systematic nature of the Madak case 
makes this implausible. It seems more likely, as Ross (1994) argues, that lenition 
is the result of imperfect second language learning by earlier generations, who 
were multilingual in their Papuan vernacular and the Austronesian ancestor of 
Madak but spoke the latter with a marked “Papuan accent” due to interference 
from their native vernacular, much as English-speaking learners of French 
incorrectly aspirate voiceless stops in that language. Because this imperfectly 
learned language was the model of the spoken language for succeeding generations 
as the villages shifted from their original Papuan vernacular to the Austronesian 
one, lenition of stops became established as a pervasive phonological feature of 
it, a typical Papuan trait, but unusual for an Austronesian language. 
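
The stem-initial alternation shown for Madak in (6a) can be restated as a small rule. The following Python sketch is illustrative only: the function name `lenite_after_prefix`, the vowel set, and the restriction to stem-initial stops are assumptions made for the example, and the lenited segments are written as they are transcribed in (6a), with no claim about their exact phonetic values.

```python
# Illustrative sketch only; segments follow the transcription used in (6a).
VOWELS = set("aeiou")  # assumed vowel inventory for the purposes of the sketch

# Voiceless stop -> lenited counterpart, as the forms are written in (6a).
LENITION = {"p": "v", "t": "r", "k": "y"}


def lenite_after_prefix(prefix: str, stem: str) -> str:
    """Attach a prefix, leniting a stem-initial stop that ends up between vowels."""
    prefix = prefix.rstrip("-")
    between_vowels = (
        bool(prefix)
        and prefix[-1] in VOWELS
        and len(stem) > 1
        and stem[1] in VOWELS
    )
    if between_vowels and stem[0] in LENITION:
        stem = LENITION[stem[0]] + stem[1:]
    return f"{prefix}-{stem}"


for stem in ("pas", "tat", "kalyi"):
    print(lenite_after_prefix("la-", stem))
```

The sketch deliberately leaves word-final stops untouched, as in la-rat, where only the first stop of tat stands between vowels once the article is attached.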

Grammatical metatypy is much more common than phonological metatypy, not 
only in the New Guinea region, but also globally. The overall superficial typo- 
logical similarity of many of the Papuan languages throughout the region and 
the even greater similarities between unrelated languages in many sub-areas within 
it are plausibly the result of long-term grammatical metatypy. We can distinguish 
different types of grammatical metatypy according to the level of grammar affected: 
morphology, intraclause level syntax, interclause level syntax. Morphological 
metatypy is very widespread in New Guinea and is particularly diagnostic of smaller 
linguistic sub-areas. Perhaps one of the most striking examples is a two-gender 
system, masculine versus feminine, with feminine as the unmarked gender. 
Although the formal realizations differ widely, this paradigmatic opposition is 
found throughout the northern lowlands of mainland New Guinea and adjacent 
areas, through the central hub of the island and down into the south coast of Papua, 
across many distinct linguistic families. Another example is switch-reference 
systems, which monitor the sameness versus difference of the referents of the 
subject NPs of succeeding clauses and which have diffused widely across the central 
highlands of New Guinea and adjacent areas. 

Within smaller areas of intense language contact, we see more intimate exam- 
ples of morphological metatypy. Both Yimas and Alamblak (Bruce 1984) exhibit 
a cross-linguistically unusual type of verb serialization in which verb roots are 
juxtaposed next to each other, but within a single word, so that all verbal 
inflections flank the juxtaposed verb roots as prefixes and suffixes (in Alamblak 
most affixes are suffixes). Essentially verb serialization in these languages looks 
like verb compounding: 


(7) a. Yimas: narm pu-tpul-kamprak-r-akn 
skin 3PL-hit-break-PERF-3SG 
‘They hit and broke his skin.’ 
b. Alamblak:  kéfra-e feh-r tu-finah-an-r 
spear-INST pig-MASC throw-arrive-1SG-3SG.MASC 
‘I speared a pig.’ 


Adverbial modifiers are incorporated into the verb, like verb roots in the serial 
verb constructions: 


(8) a. Yimas: paygra-na-kwanan-kulanay 
1PC-PROG-aimlessly-walk.about 
‘We are walking about aimlessly.’ 
b. Alamblak: yén-r nur-nhen-mé-r 
child-MASC cry-feignedly-REM.PAST-3SG.MASC 
‘The boy cried feignedly.’ 


As are all derivational affixes such as applicatives or causatives; in fact, both Yimas 
and Alamblak use the same verb root ‘hold/get’ as a direct causative marker: 


(9) a. Applicatives: 

Yimas: uray pu-tay-yawra-t-akn 
coconut 3PL-BEN-pick.up-PERF-3SG 
‘They picked up a coconut for him.’ 

Alamblak: na yima-m wikna-hay-mé-an-m 
1SG person-PL buy-BEN-REM.PAST-1SG-3PL 
‘I bought (it) for the people.’ 

b. Causatives: 

Yimas: na-ka-tal-kwalca-t 
3SG-1SG-CAUS (‘hold/get’)-arise-PERF 
‘I woke him up.’ 

Alamblak: yarmutha-t kak-kkah-mé-t-a 
blanket-FEM CAUS (‘hold/get’)-hot-REM.PAST-3SG.FEM-1SG 
‘The blanket made me hot.’ 


Morphological metatypy between Yimas and Alamblak has produced a similarly 
complex polysynthetic profile in these two unrelated languages. None 
of their close genetic relatives is as polysynthetic; the forces of metatypy have driven 
them to be more like each other in this aspect of their verbal morphology, but 
less like their own relatives. 

The processes of metatypy at the intraclausal level of syntax are well exemplified 
by basic word order changes that have taken place as a result of contact between 
Austronesian and Papuan languages. Proto-Austronesian and its immediate 
descendants in the New Guinea region like Proto-Oceanic were left-headed lan- 
guages with verb-initial or verb-medial word order within clauses, as exemplified 
by Tolai (Mosel 1984): 


(10) Tolai: a tutana i kita ra pap ta muru-na-davai 
D man 3SG hit D dog at back-of-tree 
‘The man hit the dog at the back of (= behind) the tree.’ 


Note the clear left-headed structure of this Tolai clause: the determiner occurs first 
in the DP, the P is a preposition, and the verb is initial in the VP (to the extent 
that this can be motivated in the language). Most, but not all, Papuan languages 
have the reverse right-headed structural typology, with preferred verb-final 
word order, as in Watam: 


(11) Watam: namot an padoy mbin  kiau an yga-ruye-ri 
man D tree behind dog D FOC-hit-PAST 
‘The man hit the dog behind the tree.’ 


Here, the D occurs last in the DP, the P is a postposition, and the verb is clause- 
final and indeed VP-final (see Foley 1999 for justification of a VP in Watam). For 
many Austronesian languages of the northeastern and southeastern coastal areas 
of New Guinea, their basic syntactic profile has shifted from the earlier left-headed 
typology to the right-headed typology of Papuan languages. Takia (Ross 1996) is 
one such language: 


(12) Takia: tamol an yai i-fun-ag-da 
man D 1SG 3SG-hit-1SG-IMPERF 
‘The man is hitting me.’ 


As is Manam (Lichtenberk 1983), which in keeping with Papuan languages, has 
lost the D category and with it DPs, having NPs instead: 


(13) Manam: tamoéata boro di-taotdon-i 
man pig 3PL.R-hunt-3SG 
‘The men were hunting the pig.’ 


(That Watam has DPs is in fact unusual for Papuan languages, most of which 
have NPs in preference to DPs, and given the pervasive Austronesian influence 
on this language, this feature is itself quite probably the result of language con- 
tact with and diffusion from Austronesian languages; note that the Watam D is 
homophonous with that in Takia, an, and itself may be an Austronesian loan, 
although it has been thoroughly nativized and forms its plural in the normal Watam 
fashion: and). 

Cases of switches in the opposite direction, in which right-headed Papuan languages 
shift under contact pressure to the left-headed profile of the Austronesian 
languages, are also attested, although they appear less common, a fact which 
itself calls for an explanation. The best-known case is in the West Papuan family 
of languages, spoken discontinuously around the area of the northern half of 
Halmahera Island (these languages subdivide into two branches, West Makian 
and Northern Halmahera, the latter of which in turn divides into three branches, 
Sahu, Ternate-Tidore and Northeast Halmahera) and the extreme west of the Bird’s 
Head Peninsula in Papua. The very closely related languages of the Northeast 
Halmahera branch have the typical right-headed typology of Papuan languages, 
here illustrated by Galela (Voorhoeve 1994): 
(14) Galela: 
o nyawa i-tagi de o maijanga moi y-a-make 
D man 3SG-go and D deer one 3SG.MASC-3SG-see 
‘A man went and saw a deer.’ 


o gota ka po-supu 
D tree in 3SG.FEM-climb 
‘She climbed a tree.’ 


Note that these Northeast Halmahera languages are not completely consistently 
right-headed: DPs as above are in fact left-headed as in Austronesian languages, 
but these languages do have postpositions and the verb-final constituent order 
(although this is variable) of right-headed languages. There are good reasons to 
believe that the Northeast Halmahera languages are conservative here and reflect 
the typological profile of Proto-West Papuan, as there are no other right-headed 
languages in their region. Having said that, every other branch of the West Papuan 
family (West Makian, Sahu, Ternate-Tidore, and the languages of the west Bird’s 
Head Peninsula) is left-headed, as in Moi from the west Bird’s Head (Menick 
1996): 


(15) Moi: 
wi-sik oo p-osu wi-gik p-ana lun 
3SG.MASC-put banana 3SG-to 3SG.MASC-mouth 3SG-go inside 
‘He put a banana into his mouth.’ 


West Bird’s Head languages like Moi lack the category P, but use locational verbs 
in serial verb constructions (they are marked as verbs by the presence of the sub- 
ject agreement prefixes) to indicate notions typically expressed by these; this too 
is a quite common trait for Austronesian languages of the New Guinea region. 
Note that the structure of this Moi clause is strictly left-headed: the locational verbs 
precede their complements. 

One of the most striking examples of grammatical metatypy at the level of inter- 
clausal syntax is the diffusion along the north coast of New Guinea across lan- 
guages of diverse families, Austronesian and Papuan, of a quite specific pattern 
of clause chaining. Clause chaining is a structural pattern in which clauses 
headed by morphologically stripped down verbs precede ones headed by a fully 
inflected verb, from which the previous verbs take their specifications for these 
inflectional features. Watam of the Papuan Lower Ramu family illustrates: 


(16) Watam: 
waut nakani mbo ga-r sayga-r 
stone big a LOC climb-R go-R 


timoy an yg-utki-r ak-ri 
on.top this FOC-stand-R call.out-PAST 
‘(He) went and climbed up a big rock and stood on top of it and called out.’ 
In this example only the final verb is specified for tense, by -ri PAST. The previous 
verbs simply mark their dependence on the final tense inflection, i.e. they 
are also to be taken as past tense, and realis status through the suffix -r. Clause 
chaining or structures very much like it such as converbs (Haspelmath & König 
1995) are overwhelmingly a feature of right-headed languages cross-linguistically. 
While, as we have seen, many of the Austronesian languages of the north coast 
of New Guinea have shifted from an earlier left-headed typology to a right-headed 
one through long-term contact, a few of the Madang region have gone further 
and innovated a clause-chaining pattern typical of right-headed languages and 
pervasive among Papuan languages. Takia and Gedaged (Ross 1987) exemplify 
this development: 


(17) Takia: iŋ i-marsi-go fud i-ani a 
3SG 3SG.R-sit-R banana 3SG.R-eat R 
‘He sat and ate a banana.’ 
Gedaged: 0  u-seg-me-g u-nasi-lak 
2SG 2SG.R-come-SIM-R 2SG.R-see-PERF 
‘While you were coming, you saw.’ 


But now the plot thickens. Most Austronesian languages of this region lack true tense 
distinctions; rather the basic categories of verb inflection contrast realis (R) versus 
irrealis (IRR) in the subject agreement prefixes, as in Manam (Lichtenberk 1983): 


(18) Manam:  u-pile m-pile 
1SG.R-speak 1SG.IRR-speak 
‘I spoke, am speaking.’ ‘I will speak.’ 


In the clause-chaining right-headed Austronesian languages of the Madang 
coast, this contrast in the subject agreement prefixes is lost in favor of it being 
marked by sentence-final postverbal affixes or clitics, again as in Papuan languages. 
The contrast is also marked on the dependent verbs in clause-chaining structures: 


(19) Takia: iŋ i-marsi-go fud i-ani a 
3SG 3SG-sit-R banana 3SG-eat R 
‘He sat and ate a banana.’ 


iŋ i-marsi-pe fud i-ani wa 
3SG 3SG-sit-IRR banana 3SG-eat IRR 
‘He will sit and eat a banana.’ 


These two suffixes which mark dependent verbs in clause-chaining structures in 
Takia, -go R and -pe IRR, are old Austronesian conjunctions (mbe is the conjunction 
‘and’ in Manam) which have been redeployed for clause-chaining structures. 
They are needed to realize what is an old and robust Austronesian verbal 
inflectional contrast in status, but do so now in a very non-Austronesian and Papuan 
way, by clause-chaining suffixes. 
But the plot thickens still more. Papuan languages generally contrast with 
Austronesian ones in having quite rich systems of tense distinctions, and this is 
no less true of the Papuan languages of the Madang coast. As expected, in clause- 
chaining structures only final verbs get marked for tense, but remarkably, depen- 
dent verbs make a systematic contrast between realis and irrealis, exactly as in 
the Austronesian languages like Takia. This is true across a wide range of Papuan 
language families along the Madang coast, here illustrated by the Trans New Guinea 
language Bargam (Hepner 1995) and the Lower Ramu language Watam: 


(20) a. Bargam: 
leh-adi —_ ekton-y-augq 
go-R 1PL yell-IMPERF-1PL 
‘As we were going, we were shouting.’ 


ni leh-eq i  ninmen  karuw araq wil em-O-am 
2SG go-IRR 1PL 2SG.DAT meat a hit do-FUT-1PL 
‘If you go, we will kill a pig for you.’ 


b. Watam: 
ma birka-r nayas ŋg-amb-ri 
3SG sit-R banana FOC-eat-PAST 
‘He sat and ate a banana.’ 


ma birak-mbe nayas g-am-na 
3SG sit-IRR banana FOC-eat-FUT 
‘He will sit and eat a banana.’ 


It is more than likely that while clause chaining has arisen in the Austronesian 
languages along the Madang coast through grammatical metatypy due to long-term 
language contact with Papuan languages, in which these structures are endemic, 
the actual patterns in clause chaining in this area, which oppose realis- and 
irrealis-dependent verb forms, have in turn diffused outward from Austronesian 
languages to Papuan ones through language contact stemming from the intensive 
trading contacts along this coast. One very strong piece of evidence for this conclusion 
is the form of the Watam dependent verbal suffix for irrealis, -mbe, without 
question a borrowing from Austronesian; note that the Takia dependent verb suffix 
for irrealis is -pe. Here on the northeast coast of New Guinea we see how complex 
the effects of metatypy through language contact can be: a clear ping-pong 
effect, of Papuan to Austronesian and then Austronesian back to Papuan again. 


5 Trade Jargons and Pidgins 


Besides borrowing and metatypy, there is a third possible effect of language 
contact in the New Guinea region: a specific trade jargon or pidgin language. This 
is most common in areas where knowledge of the language of the economically 
dominant group in an ongoing trade relationship is withheld for various 
ideological reasons. Williams (1993) provides a summary of the attested cases of 
trade pidgins in the New Guinea region; here I will only discuss the complex situation 
of pidginization in the upper Karawari River region. This case is simply 
one of a number of trade pidgins in the Sepik region. All Sepik villages engage 
heavily in commercial trade relationships with neighboring villages. Even if a 
village could theoretically be self-sufficient subsistence-wise, it is not, but is 
engaged in long-term intergenerational trading relationships with its neighbors. 
Many items were traded traditionally, but within the riverine Sepik region, the 
main trade transaction was the carbohydrate staple sago for the protein staple 
fish (Gewertz 1983). The protein staple fish was, of course, more highly valued, 
and those villages with ecologically advantageous riverine or lacustrine environments 
for catching fish, such as the Iatmul, Chambri, Yimas, etc., would 
exchange it for sago produced by their comparatively disadvantaged grasslands 
or rainforest/swampland-dwelling neighbors. 

The language of these secular exchanges was a pidgin language using the lan- 
guage of the fish producers as superstratum. Such trade-based pidgin languages 
have been reported in the Sepik region for the Iatmul, Manambu, and Yimas, but 
only in the last case is there any significant documentation of the pidgin language(s). 
While Yimas may be the dominant lexifier, the superstrate, of the pidgin languages, 
they all contain elements of the other, substrate, languages spoken by the sago 
suppliers. 

Yimas village traditionally had its main trading relationships with villages 
speaking three languages: the closely related Karawari and the unrelated 
Arafundi and Alamblak, the last two also being unrelated to each other. A trade 
pidgin was used in all three trade encounters; it was an index of this kind of 
secular exchange between villages. The pidgins themselves were the property of 
the clans that had the rights to trade with these other groups. Table 39.1 provides 
a short comparative lexicon between Yimas and two of its lexified pidgins, one 
for the Arafundi speakers, and the other for the Alamblak speakers (the Yimas-Alamblak 
Pidgin data are drawn from Williams (2000); the rest are data from my 
fieldwork). Both pidgin languages exhibit a mix of Yimas and substrate lexical 
elements. The Yimas percentage is higher in the Arafundi pidgin and significantly 
lower in the Alamblak pidgin. The non-Yimas lexicon in Arafundi is of Arafundi 
origin, but most of the non-Yimas forms in the Alamblak pidgin are not in fact 
from Alamblak, but from Karawari, and identical to the lexical forms in the 
Yimas-Karawari pidgin. 

Like all pidgins, the Yimas-based ones show structural simplification from 
the superstrate language. Yimas (and Karawari) are morphologically complex 
polysynthetic languages with multiple agreement for NP arguments; there is no 
case marking. Consider these transitive verbs: 


(21) Yimas: pu-ka-tay pu-ya-tay 
3PL-1SG-see 3PL-1SG-see 
‘I saw them.’ ‘They saw me.’ 
pu-n-tay na-mpu-tay 
3PL-3SG-see 3SG-3PL-see 
‘He saw them.’ ‘They saw him.’ 
Table 39.1 Lexicons of Yimas-based Pidgins 


Gloss           Yimas      Arafundi Pidgin            Alamblak Pidgin 
‘man’           payum      payum                      yenmisinawt 
‘woman’         yaykum     aykum                      yerimaywi 
‘village’       num        kumbut                     yimuyga 
‘betelnut’      patn       patn                       yabu 
‘pig’           numpran    numbrayn                   yimbian 
‘sago’          tupwi      tupwi                      sipi 
‘cassowary’     awa        karima                     awa 
‘basket’        impram     yamban                     yamban 
‘water’         arém       yim                        méray 
‘tobacco’       yaki       yaki                       yagi 
‘canoe’         kay        kay                        kay 
‘flying fox’    kumpwi     aringum                    kumbut 
‘I’             ama        ama                        apia 
‘you’           mi         mi                         mi 
‘he/she’        min        min                        masaygum 
‘talk’          malak-     mariawk-                   mariak- 
‘give’          ya-        asa- (<Y aca- ‘send’)      seri- 


The first person singular shows distinct forms: ka- 1SG for the subject of the 
transitive verb tay- ‘see’ and ya- 1SG for the object. The third person singular likewise 
distinguishes a prefix n- 3SG for the subject of the transitive verb from a form 
na- 3SG for the object. 

All this complex agreement morphology is lost in the pidgins, which, like 
pidgins generally, are essentially isolating languages. However, the two pidgins, 
the Arafundi one and the Alamblak one, do not signal grammatical functions 
in the same way, and neither makes use of the fixed word order so 
typical of pidgins elsewhere in the world. Consider the system for marking 
grammatical relations in the Arafundi pidgin: 


(22) Yimas-Arafundi Pidgin: tupwi min am-bi ta-nan 
sago 3SG eat-DEP PROG-NON-FUT 
‘He’s eating sago.’ 


ama min namban kratɨkɨ-nan 
1SG 3SG DAT hit-NON-FUT 
‘I hit him.’ 


The dative postposition namban comes from the Yimas allative postposition 
nampan ‘toward’. In the first example, the two arguments of the transitive verb 
am- ‘eat’ differ in animacy; the subject is animate as usual, and the object, inanimate. 
In such cases no overt disambiguating marker is necessary or appears. 
However, in the second example, both arguments of the transitive verb, kratɨkɨ- 
‘hit’ from Yimas kratk- ‘fight’, are animate; in such cases the dative postposition 
namban, which marks the typically animate recipients of ditransitive verbs like 
‘give’, is pressed into service for signaling the animate object. Neither source 
language has this feature; both use verb agreement to express both arguments of 
transitive verbs. 


6 Conclusion 


As can be gleaned from the case studies in this chapter, New Guinea is an 
ideal region in which to study language contact. The very high genetic diversity 
there and the sharp typological contrast between Austronesian and Papuan lan- 
guages provide many opportunities to study many different types of language 
contact situations and to test theories about mechanisms of and constraints 
on the diffusion of linguistic traits across languages in contact. But, sadly, 
researchers need to move fast because these conditions will not last much longer: 
the languages of the New Guinea region are gravely endangered and many 
are disappearing quickly. This unique laboratory of language contact will soon 
be closed. 


NOTE 


1 Abbreviations used in this chapter: 


1        first person        LOC        locative 
2        second person       MASC       masculine gender 
3        third person        NON-FUT    non-future 
BEN      benefactive         NP         noun phrase 
CAUS     causative           PAST       past tense 
D        determiner          PC         paucal number 
DAT      dative              PERF       perfective 
DEP      dependent verb      PL         plural number 
DL       dual number         PRES       present tense 
DP       determiner phrase   PROG       progressive 
FEM      feminine gender     R          realis 
FOC      focus               REM.PAST   remote past tense 
FUT      future tense        SG         singular number 
IMP      imperative          SIM        simultaneous 
IMPERF   imperfect 
INST     instrumental 
IRR      irrealis 
40 Contact Languages of 
the Pacific 


JEFF SIEGEL 


With over 1,000 indigenous languages and a recent history of colonial exploita- 
tion, the Pacific region has provided a fertile context for the growth of contact 
languages. This chapter first describes new languages (pidgins and creoles) and 
then new dialects (koines and indigenized varieties) that have emerged in the Pacific 
as the result of language contact. For the purpose of this chapter, the Pacific region 
is defined as including only small island countries and territories (thus exclud- 
ing Australia, New Zealand, and the countries of Asia and the Americas that 
border the Pacific Ocean). Also, due to space limitations, the chapter concentrates 
only on lexicon and morphosyntax. 


1 New Languages: Pidgins 


Pidgins are new languages that develop out of a need for a medium of commu- 
nication among people who do not share a common language — for example, 
between traditional trading partners or among plantation laborers from diverse 
geographic origins. Most of the forms in the lexicon of the new language come 
from one of the languages in the contact situation, called the “lexifier” (or some- 
times the “superstrate”) — often the language of the group in control of the area 
where contact occurs. However, the meanings or functions of some of the lexical 
forms of the pidgin, as well as the phonology and grammatical rules, are differ- 
ent from those of the lexifier. First, they appear to be greatly reduced or simplified 
in comparison, and second, they may resemble those of one or more of the other 
languages in contact, usually referred to as the “substrate languages.” 

The first step in the development of a pidgin occurs when people use their own 
individual ways of communicating. Speakers of the lexifier may simplify their 
language or use a simplified register, referred to as “foreigner talk.” Speakers of other 
languages use words and phrases they have learned from the lexifier, but with 
little if any grammatical morphology, or with overgeneralized rules, as in the early 
stages of second language acquisition. The combination of these individual ways 
of communicating, in which some conventions have emerged, is called a “jargon” 
or “pre-pidgin.” 


1.1 Indigenous pre-pidgins 


Three pre-pidgins lexified by indigenous languages have been recorded in the 
Pacific. Jargon Fijian (Siegel 1987) was derived in part from the foreigner talk that 
Fijians used to communicate with Tongans and other neighbors (Geraghty 1978). 
Simplified Motu (Dutton 1985; 1997) was used by the Motu people on the south 
coast of New Guinea around Port Moresby to talk to visitors or trading partners 
from other language areas. Both of these pre-pidgins were later used for com- 
munication with European explorers, traders, beachcombers, or missionaries — Jargon 
Fijian from the early 1800s (Siegel 1987) and Simplified Motu from 1874 (Dutton 
1985; 1997). 

The third indigenous pre-pidgin, Hawaiian Maritime Pidgin (Day 1987) or Jargon 
Hawaiian (Siegel 1987; Roberts 2005), emerged after 1778 when Hawai‘i became 
a stopover for large numbers of American and European ships involved in the 
fur trade between the northwest coast of America and China. The use of this 
pre-pidgin was extended to the sandalwood trade in the early 1800s and then to 
the whaling industry in the 1820s. By the 1840s and 1850s hundreds of American 
and European ships stopped in Hawai‘i each year and the Hawaiian pre-pidgin 
had been spread around the Pacific by sailors, who included many Hawaiians 
(Roberts 1995).' 

The linguistic features of these pre-pidgins have been ascertained from a 
variety of historical sources, including word and phrase lists collected by early 
explorers and the writings of the first missionaries (see Dutton 1985; 1997; Day 
1987; Roberts 1995; 2005). Here they are exemplified by Jargon Fijian. 

The features of Jargon Fijian included the use of only independent forms of pro- 
nouns (INDEP), rather than subject referencing (SRP) or possessive (POSS) pronouns; 
the overgeneralized use of sa (based on a Fijian aspect marker sā) as a predicate 
marker (PM); a lack of verbal morphology (such as transitive marking); and the 
use of adverbs such as malua ‘slowly, by and by’ instead of tense or aspect 
markers (Siegel 1987: 115-19).* An example is: 


(1) malua sa laGo mai koeau. 
later PM go DIR 1SG.INDEP 
‘I’ll come later.’ 


In standard Fijian (SF), this would be: 


(2) au na lako mai e muri. 
1SG.SRP FUT go DIR LOC following 


However, as is generally the case in pre-pidgins, these features were not found 
consistently, and there was a great deal of individual variation — for example in 
basic constituent ordering between SV and VS, and in the ordering of the possessor 
and possessum in genitive constructions. 


1.2 Stable pidgins 


If groups in contact continue to use a pre-pidgin with each other, and especially 
if speakers of other languages start to use the pre-pidgin as a lingua franca among 
themselves, norms begin to stabilize. Individual variation may decrease, some vari- 
ants may be eliminated, and further communicative conventions may develop. 
The end result is a “stable pidgin.” This process of stabilization began in Fiji as 
a result of the introduction of small, European-owned plantations in the 1860s. 
Jargon Fijian was used in the earlier plantations, where the laborers were Fijians. 
However, from 1865 to 1911, 27,000 indentured laborers were imported from islands 
in the New Hebrides (now Vanuatu), the Gilberts (now Kiribati), the Solomons, 
and New Guinea. Jargon Fijian continued to be the language used to run these 
plantations, but it also became the lingua franca among the Pacific laborers, who 
spoke scores of different languages. Thus, in the late 1860s and early 1870s, Jargon 
Fijian stabilized to become Plantation Pidgin Fijian (PPF; Siegel 1987). 

Most of the features of PPF were similar to those of Jargon Fijian, except that 
they were used more consistently, and a great deal of the variation was eliminated 
— for example, constituent ordering stabilized as SVO and possessum possessor. 
Also, some additional features conventionalized, such as fusion of the third person 
inalienable possessive suffix onto nouns that are normally inalienably possessed. 
These features can be seen in the following examples of PPF, compared to stan- 
dard Fijian (SF): 


(3) a. PPF: na ligana koiau sa musi. 
ART hand 1SG.INDEP PM hurt 
b. SF: sa mosi na liga-qu. 
ASP hurt DEF hand-1SG.POSS 
‘My hand hurts.’ 


(4) a. PPF: kokoya sa musu na tabana. 
3SG.INDEP PM cut ART branch 
b. SF: e musu-ka na taba-na (o koya). 
3SG.SRP cut-TR.OM DEF branch-3SG.POSS PRP 3SG 
‘He cut down the branch.’ 


Compared to SF, however, PPF still had virtually no grammatical morphology 
and demonstrated a great deal of reduction. For example, SF has between 70 and 
135 pronouns, depending on how they are counted. There are subject marking, 
objective and independent sets, and four different possessive sets, each set with 
pronouns indicating person, the inclusive/exclusive distinction, and number 
(singular, dual, paucal, and plural). In PPF, there were only six pronouns - first, 
second and third person, singular and plural. 
Most of the Pacific Island laborers who survived their contracts in Fiji returned 
to their home islands. There are reports that in the late 1890s and early 1900s, 
Pidgin Fijian was being used as a lingua franca in the western Pacific, to which most 
of the laborers returned (Siegel 1992). However, this was soon displaced by an 
English-lexified pidgin (see below). Those Pacific Islanders who stayed in Fiji were 
most often married to indigenous Fijian women and their children spoke the usual 
Fijian of their communities, not PPF. But in addition to workers imported from 
Pacific Islands, from 1879 to 1916 over 60,000 indentured laborers were imported 
from India. About a third of these worked on small plantations alongside Pacific 
Islanders. There they learned PPF, and it became the language of interethnic 
communication both on and off the plantations. Unlike the Pacific Islanders, the 
indentured Indians came mostly as family groups, not single men, and when their 
contracts expired, the nearly 40 percent who stayed on did not integrate into the 
Fijian community and spoke Fiji Hindi (see below). However, they continued 
to speak Pidgin Fijian for interethnic communication, and the language is still 
spoken today for this purpose (Siegel 1987). Current Pidgin Fijian (CPF) spoken 
by Indo-Fijians has been influenced by their language and differs from PPF in 
possessor possessum ordering, optional SOV ordering, and the use of the SF demonstrative 
(o)qō for the third person singular pronoun — for example: 


(5) a. CPF: koiau na tamana oqō sa kaDi. 
1SG.INDEP ART father 3SG PM call 
b. SF: e kaci-vi koya na tama-qu. 
3SG.SRP call-TR 3SG.OBJ DEF father-1SG.POSS 
‘My father called him.’ 


Another pidgin language also emerged from the plantation system in Fiji. 
From 1882, much larger plantations were established by the Colonial Sugar 
Refining Company, employing nearly all-Indian labor. The majority of Fiji’s 
Indian immigrants worked on these plantations, where the policy was to use 
Hindustani, the lingua franca of northern India (the vernacular form of what is 
now called Hindi). But many European overseers spoke a pidginized form of this 
language, including some features of Bazaar Hindustani, a pidgin spoken in Calcutta, 
the main port of embarkation for most of the laborers, who came from North India. 
After 1903, however, 42 percent of the laborers embarked from Madras and were 
from South India. They spoke Dravidian languages and were mostly not familiar 
with the Hindustani lingua franca. However, when they arrived on the plantations, 
they quickly had to learn the Jargon Hindustani spoken there to communicate 
with the overseers and the North Indian laborers. Thus, when the pre-pidgin was 
used as the medium of communication among the laborers themselves, it began 
to stabilize and became Pidgin Hindustani (Siegel 1987). This pidgin also came 
to be used with Fijians and other groups outside the large plantations, and today 
remains as a language of interethnic communication, alongside Pidgin Fijian. 

Elsewhere in the Pacific, the first missionaries to arrive in Port 
Moresby in 1874 were soon followed by visitors and settlers from diverse areas 
such as China, Singapore, India, the Philippines, America, Europe, and various 
Pacific islands. The Motu people used Simplified Motu to communicate with these 
people as they had done with other outsiders. When this pre-pidgin became the 
lingua franca among the visitors and settlers themselves, a stable pidgin Motu 
began to emerge. In 1884, the Protectorate of British New Guinea was established, 
becoming a colony in 1888. The new colonial government adopted this form of 
Motu as the unofficial language of administration, and used it to establish law 
and order and government control throughout the colony. The Armed Native 
Constabulary was set up in 1890, and the association of the language with this 
police force led to the name Police Motu. As the language further stabilized, it 
was influenced by Papuan substrate languages, especially Koita and Kiwai, 
and by the pidginized English spoken by some of the original police recruits. 
Eventually Police Motu became the lingua franca among speakers of the many 
indigenous languages in the colony (for more details, see Dutton 1985; 1997). 

When the colony became the Australian Territory of Papua after World War I, 
Police Motu was still unofficially used for administration. With the start of World 
War II, it was given official recognition and codified for use in published and broad- 
cast media. In 1971, when it had approximately 150,000 speakers, the name was 
changed to Hiri Motu, because of the view (incorrect, as it turns out) that it was 
the direct descendant of the language the Motu people used for the hiri, tradi- 
tional annual trading voyages to the Gulf of Papua. In 1975, the Territory of Papua 
along with the Protectorate of New Guinea on the northern part of the island 
(also administered by Australia) became the independent country of Papua 
New Guinea, and Hiri Motu along with Tok Pisin (see below) became the two 
unofficial national languages. However, in recent years, the number of speakers 
of Hiri Motu has been decreasing, and it is being displaced by Tok Pisin. 

In Hawai‘i, as in Fiji, a stable pidgin emerged with the establishment of a 
European-run plantation system. In the earliest plantations, from 1835 to the 1850s, 
the laborers were Hawaiians, and Jargon Hawaiian was used as the plantation 
language. From 1852 to 1876, approximately 2,000 Chinese contract laborers were 
imported, and reports show that they also used this form of Hawaiian for 
communication (Roberts 2005). Some stabilization may have started during this 
period, but a few years later larger numbers of Chinese laborers were imported 
(approximately 37,000 from 1877 to 1897), as well as indentured workers from 
other countries, including Portugal (10,000 from 1878 to 1887), Germany (1,050 
from 1882 to 1884), and other Pacific islands (2,450 from 1877 to 1887). Thus, it 
seems clear that a stable Pidgin Hawaiian emerged in the late 1870s or early 
1880s when the pre-pidgin was used as a lingua franca among all these language 
groups on the plantations. Pidgin Hawaiian remained as the plantation language 
and an important lingua franca elsewhere until the early 1900s, when it was dis- 
placed by the developing English-lexifier pidgin (see below). 


1.3 Other indigenous pidgins 


Several indigenous trading pidgins existed before European contact in what is 
now Papua New Guinea. Two of these were the actual Hiri Trading Languages 
used by the Motu people in the traditional hiri expeditions mentioned above. These
were based on the languages of their trading partners: the Eleman languages, 
and Koriki (Dutton 1983; 1997). Several other trading pidgins, based on Papuan 
(non-Austronesian) languages have existed in the Sepik region of northern 
New Guinea. The best-known of these is Yimas Pidgin (Foley 1988). This is unlike 
other pidgins in that it has relatively complex morphology, though still not as 
complex as that of the main lexifier language, Yimas. Yimas Pidgin is also inter- 
esting in that it has multiple varieties, each used by a different clan with trading 
rights to villages speaking a particular language. These varieties include Yimas- 
Arafundi (with at least three separate dialects, depending on the village dialect 
of Arafundi), Yimas-Alamblak, and Yimas-Karawari (see Williams 1993; 2000). 
A now extinct Yimas-Iatamul variety existed as well. Other trading pidgins or 
pre-pidgins reported in the Sepik area are Iatamul Jargon, Kwoma Pidgin, Hauna
Trade Language, and Arafundi-Enga Pidgin (see Williams 1993; Mühlhäusler et
al. 1996). In the Papua region, speakers of the Austronesian language, Mekeo, 
devised “trade jargons” to use in commercial transactions with non-Austronesian 
Kunimaipa speakers (Jones 1996). Similar to Yimas Pidgin, this Mekeo Trading 
Language had three varieties — Imunga, Ioi, and Maipa — each belonging to a par- 
ticular family with inherited trade contacts. 


1.4 Melanesian Pidgin 


The best-known and most widely spoken contact language in the Pacific is 
Melanesian Pidgin (MP), with over four million speakers. The first stage in the 
development of the language was in the early 1800s when Europeans (including 
Australians and Americans) began to have frequent contact with Pacific islanders 
as the result of whaling in the region, followed by trading in sandalwood and 
bêche-de-mer. In some places, the Europeans attempted to learn the local language,
as we have seen for Fiji and Hawai‘i. But in others, they used foreigner talk, or
existing contact varieties, such as New South Wales Pidgin English and Chinese
Pidgin English, that they had learned through previous trading encounters (see
Baker & Mühlhäusler 1996). Pacific islanders then picked up aspects of these
varieties and also introduced their own second language versions of the English they
were exposed to. As a result, a pre-pidgin emerged that was used across the Pacific. 
This is often referred to as South Seas Jargon (Clark 1979). Some features of South 
Seas Jargon are illustrated in the following examples (from Clark 1979: 29-30, 37):


(6) a. Only he got using all the same pigeon. (Gilbert Islands, 1860)
b. Me saba plenty. (Gilbert Islands, 1860)
c. Canoe too little, by and bye broke - All man go away, canoe gone, very good
me stop. (Lifu (Loyalty Islands), 1850)
d. He too much bad man. (Kosrae, 1860)


The features include got meaning ‘have’, all the same ‘like, similar to’, saba (= savvy) 
‘know’, plenty ‘a lot (of)’, by and by ‘later, in the future’, stop ‘stay, be at’, and too 
much ‘very’.

The second stage of development came with the beginning of the Pacific labor
trade in 1863. Islanders, mainly from the Gilbert Islands, New Hebrides, and 
Solomon Islands were recruited to work on plantations in Queensland (Australia) 
and later in Fiji, as described above. Unlike in Fiji, however, English (or rather 
the English-based pre-pidgin) was the language used to run the plantations in 
Queensland. The pre-pidgin then became the lingua franca among the linguistic- 
ally diverse laborers, and with continued use, a stable pidgin language began to 
develop: early MP. Like other restricted pidgins, this had a small vocabulary,
mainly based on English, no grammatical inflections, and only a few grammatical
rules. Some examples (from Keesing 1988: 43-5) are:


(7) a. White man allsame woman, he no savee fight. (Kolombangara (western
Solomon Islands), 1880)
b. Suppose me come along school, by-and-by me no savee fight.
‘If I come to school, I won’t be able to fight.’ (Bundaberg, Queensland, 1886)
c. Me no care, me no belong this fellow place, man here no good - rogue. (Tanna,
New Hebrides, 1877)


These examples show the continued use of allsame from all the same to mean ‘like, 
similar to’ (as in example 6a); savee (saba in example 6b) meaning ‘know how to’ 
and extended to mean ‘be able to’, and by-and-by (6c) used to indicate the future. 
Also shown is the emergence of along as a general locative preposition. All these 
features have correspondences in modern MP. On the other hand, the use of the 
word bad in the pre-pidgin (6d) dropped out while no good was retained. 

Labor recruiting for plantations in German-controlled Samoa began in 1878. 
Since many of the recruits had already worked in Queensland, early MP was 
transported to Samoa. From 1879, large numbers were also recruited from the 
German-controlled New Guinea Islands (especially eastern New Britain, New 
Ireland and nearby small islands). After 1885, however, laborers from the New 
Hebrides or Solomons were no longer recruited for Samoa, and early MP began 
to diverge into two slightly different varieties - one spoken in Queensland and 
one in Samoa. Over 62,000 Pacific Islanders went to Queensland between 1863 
and 1904, and more than 10,000 to Samoa between 1878 and 1913. Twentieth- 
century descriptions of Queensland Canefields English (Dutton 1980) and Samoa 
Plantation Pidgin (Mühlhäusler 1978a) give some indication of the varieties of
pidgin spoken by these laborers. 

The third stage of development of MP began when the laborers’ contracts 
finished and they returned to their home islands, bringing the developing 
pidgin with them. The pidgin spread rapidly, functioning as a lingua franca. 
It was also used by the large-scale internal labor force that worked on plantations 
in German New Guinea, the New Hebrides, and Solomon Islands at the turn of 
the century. In each of these countries, early MP further stabilized and expanded 
under the influence of the local indigenous languages. Today, there are three 
dialects, differing mainly in vocabulary and a few grammatical rules: Papua 
New Guinea Tok Pisin, Vanuatu Bislama, and Solomon Islands Pijin. (For lexical 
and grammatical differences between the dialects, see Siegel 1998; 2008; Tryon
and Charpentier 2004.) 

After MP had stabilized into its separate varieties and spread through the islands, 
it began to be used for new functions, such as imported religion. Tok Pisin was 
developed into a written language by missionaries in the 1930s, and later used 
in newspapers and radio broadcasting. Similar developments occurred with 
Bislama and Pijin. As its use was extended into new areas, MP changed linguis- 
tically to become more complex — e.g. acquiring more vocabulary and more gram- 
matical morphology. Thus, in both function and structure, MP became what is 
called an “expanded pidgin.” 

As MP was expanding, it acquired many of the grammatical features shared 
by its substrate languages, which almost all belong to several closely related groups 
of Austronesian languages, together referred to as Central Eastern Oceanic (CEO; 
Lynch, Ross, & Crowley 2002). These features came into MP via individuals either 
creating new variants by transferring properties from their CEO mother tongue, 
or selecting (subconsciously) the existing variants that were similar to properties 
in their mother tongue. The presence of corresponding features in the mother 
tongues of a large proportion of the individuals led to the reinforcement of the 
transferred and selected variants, and these became incorporated into the 
expanding pidgin (see Siegel 2008). Keesing (1988) identifies seven “core syntac- 
tic structures” of the CEO substrate languages that are found in MP, expressed 
with forms from the lexifier, English. These are: 


(a) SRP in the verb phrase 

(b) transitive suffix on verbs 

(c) adjectives functioning as stative verbs 

(d) preverbal causative marker 

(e) postnominal possessive marker 

(f) third plural pronoun used as a plural marker 

(g) exclusiveness and dual number marked in the pronoun system 


Features (a), (b), and (f) can be seen in the following examples from Bislama: 


(8) a. Man ya  i       stil-im  mane.
       man DET 3SG.SRP steal-TR money
       ‘This man stole the money.’
    b. Ol woman oli     kat-em taro.
       PL woman 3PL.SRP cut-TR taro
       ‘The women cut the taro.’


The SRP i is derived from the English word he. The plural SRP oli, used mostly 
for human subjects, is derived from the third person plural pronoun ol (< all) plus
the already existing marker i. Ol is also used prenominally as the plural marker, 
but it is now obsolete as the third person plural pronoun in Bislama, having been 
replaced by olgeta.

A preverbal causative marker mek- (from make) is shown in this example from
Pijin (Jourdan 2002: 135):


(9) mifala   nating mek-rere      yet fo  disfala gogo blong mifala.
    1PL.EXCL NEG    CAUS-be.ready yet for this    trip POSS  1PL.EXCL
    ‘We have not yet prepared anything for our trip.’


The postnominal possessive marker in Tok Pisin is bilong (< English belong): 


(10) mi  luk-im haus  bilong papa   bilong yu.
     1SG see-TR house POSS   father POSS   2SG
     ‘I saw your father’s house.’


With regard to the pronoun system, Bislama, for example, follows the CEO 
pattern, indicating dual (and trial), as well as the inclusive/exclusive distinction. 
This is done with forms derived from English pronouns as well as numerals, tu 
‘two’ or tri ‘three’, and -fala, a pronominal plural marker, derived from fellow: 


(11)                         singular  dual           trial         plural
     First person inclusive            yumitu(fala)   yumitrifala   yumi
     First person exclusive  mi        mitufala       mitrifala     mifala
     Second person           yu        yutufala       yutrifala     yufala
     Third person            hem/em    tufala         trifala       olgeta


1.5 Other English-lexified pidgins 


At least three other Pacific pidgins lexified by English have been described in the 
literature, but these did not stabilize or expand to the extent that MP did. The 
first is Papuan Pidgin English (Mühlhäusler 1978b), spoken from the 1880s well
into the 1900s, mainly in dealings of indigenous people with English speakers in 
Papua (British New Guinea). It was most closely linked to the English-lexified 
pidgins used in Queensland and the Torres Strait, but was also influenced by early 
MP. The reasons that Papuan Pidgin English never stabilized or expanded were 
twofold. First, there were no large-scale plantations or other industries with
laborers from diverse locations. Second, as mentioned above, Police Motu was 
promoted by the colonial administration and eventually displaced the English- 
lexified pidgin. 

Another variety is Nauru Pidgin English (Siegel 1990). This was spoken at least 
up until the 1980s on the tiny island of Nauru, primarily in commercial inter- 
actions between Chinese shopkeepers and indigenous Nauruans or temporary 
residents, mainly from Kiribati and Tuvalu (formerly Gilbert and Ellice Islands) 
but also from Fiji, Solomon Islands, Australia, New Zealand, Britain, India, and 
the Philippines. The origins of the pidgin may go back to the phosphate industry 
which began in 1908. Laborers were imported from both China and other Pacific 
islands, at first mostly from the Micronesian Caroline and Marshall Islands, with
small numbers also from New Guinea, and later (in the 1950s) primarily from 
Gilbert and Ellice Islands. Nauru Pidgin English has a mixture of distinctive 
features from Chinese Pidgin English — such as the numeral classifier piecee — and 
MP - e.g. the -pela/-fala adjective marker. It also has at least 17 features shared 
by the two varieties, plus some unique features, such as lexical items from 
Cantonese (e.g. yáuh ‘there is [existential], have’), Kiribati (e.g. tekimoa ‘thief’), and
Nauruan (e.g. kumo ‘pig, pork’).

The third English-lexified pidgin, Ngatikese Pidgin or Ngatikese Men’s 
Language, originated under very different circumstances (Tryon & Charpentier 
2004: 145-9). Ngatik is a small island in the Sapwuahfik Atoll, approximately 
140 kilometers southwest of Pohnpei (formerly Ponape, now part of the Federated 
States of Micronesia). In 1837, the crew of the trading ship Lambton, with the help 
of some Ponapeans, landed on Ngatik, and massacred nearly the entire adult male 
population. After the massacre, the island was populated by men from nearby 
islands, especially Pohnpei, and by European sailors and beachcombers. It seems 
that the Pacific-wide English-lexified pre-pidgin spoken in Micronesia at the time 
was used for communication among the diverse population, and this developed 
into what is called Ngatikese Pidgin. Today the first language of the 500 residents 
of the island is the Sapwuahfik dialect of Ponapean, and this pidgin has very 
restricted use. It is spoken only by adult males especially when they are involved 
in communal activities, such as fishing, although it is said to be understood by 
women and children. With regard to its linguistic features, the pronouns, demon- 
stratives, TMA markers, prepositions, and articles are derived from English while 
the nouns, verbs, and adjectives are from both English and Ponapean, although 
more commonly the latter. This gives the language more the appearance of a mixed
language (e.g. Matras & Bakker 2003) than that of a typical pidgin or creole.


2 New Languages: Creoles 


In some contexts, people in a newly emerging mixed community use a pidgin on 
a daily basis, and some of them shift to it as their primary language, which they 
speak to their children. Because of this extended use, the pidgin would already 
be expanded or in the process of expanding. Thus, children growing up in this 
context acquire the expanded pidgin as their mother tongue, and it becomes their 
community language. At this stage it is then called a “creole.” Like any other 
vernacular language, a creole has a community of native speakers and a complete
range of informal functions, as well as a full lexicon and a complex set
of grammatical rules.

Some controversy exists, however, over applying the term “creole” to some
contact languages. In Melanesia, for example, many people have recently been 
marrying outside their traditional language groups, especially in urban areas. So 
often the common language of the parents is a variety of MP, and this is what 
their children acquire as their first language. Because of this nativization (the 
process of a pidgin becoming a native language), some linguists say that MP is 
now a creole, emphasizing that it has thousands of first-language speakers and
has the functions and grammatical features found in typical creoles. However, 
others say that MP is still an expanded pidgin, pointing out that more than 90 per- 
cent of its speakers still use their ancestral language and learn MP as a second 
rather than a first language. In contrast, creole-speaking populations generally have 
shifted from their ancestral languages and are monolingual. Furthermore, MP is 
not the vernacular language of any distinct, newly emerged community. 


2.1 English-lexified creole languages 


Hawai‘i Creole 

A clear example of a creole in the Pacific region is Hawai‘i Creole (generally called 
“Pidgin” by its speakers). The early history of Hawai‘i and the concurrent use of 
Jargon Hawaiian and South Seas Jargon were described above. The sugar indus- 
try expanded rapidly in the last quarter of the nineteenth century, and imported 
large numbers of laborers from other countries. In 1884, there were approximately 
18,200 Chinese, 10,000 Portuguese, 6,600 “other Caucasians,” 100 Japanese, and 
1,400 others living in Hawai‘i, in addition to 40,000 Hawaiians and 4,200 “Part- 
Hawaiians” (Reinecke 1969: 42). Pidgin Hawaiian continued to dominate on the 
plantations at this time, but the first generation of immigrants (G1) continued to 
maintain their own languages (Roberts 2005). Because the different ethnic groups 
were segregated on the plantations, the locally born children of immigrant labor- 
ers (the G2) acquired their parents’ language and did not socialize with other 
children until they started school. There they learned the languages of their class- 
mates from other ethnic groups, including Hawaiian or Pidgin Hawaiian, as well 
as some English. 

Off the plantations, however, it was a different story. Varieties of an English pre-
pidgin, with features of both South Seas Jargon and Chinese Pidgin English, were 
being used for interethnic communication in Honolulu and other urban areas, 
and a distinct Hawai‘i Pidgin English (HPE) began to stabilize. Early HPE was 
characterized by the features typical of a restricted pidgin: no inflections on nouns 
or verbs; no copula, existential marker, or complementizers; the use of adverbs 
(such as by and by and all time) instead of tense or aspect markers; and a single 
preverbal negator no. Examples from before 1899 (Roberts 2005: 249, 150, 163) 
illustrate these features and some similarities to South Seas Jargon: 


(12) a. Melican man he too much smart. 

‘Americans are too smart.’ 

b. Ae (yes), he only boy now, no got sense. By ’n by he man, he good. 
‘Yes, he is just a boy now, lacking wisdom. When he is a man, he will 
be good.’ 

c. Today go court house buy license, go church make marry, all same haole 
[Caucasian] style. 
‘Today I’m going to the court house to buy a license and going to a church 
to get married, just like whites do.’

The situation began to change in the late 1880s with increased immigration
from Japan. By 1890 there were over 12,600 Japanese in Hawai‘i and after 1900,
a large number of G2 children of Japanese ethnicity entered the schools. In addi- 
tion, in the first decade of the twentieth century there was an influx of laborers 
from Korea, Puerto Rico, Spain, and the Philippines. When the immigrant 
population was speaking a dozen or more mutually unintelligible languages, the 
English-lexified pidgin HPE came to be used more widely as the language of 
interethnic communication, especially among the G2, many of whom had left the 
plantations. 

The next change occurred from around 1895 to the 1910s, when older G2 chil- 
dren and adults began to shift to HPE as their primary language (Roberts 1998; 
2005). With this extension of use, HPE began to expand grammatically. Examples 
from the G2 show that by 1920, many grammatical morphemes had developed 
and were frequently used where lexical items or Ø were found in early HPE. These
examples come from historical attestations provided by Roberts (1998; 2005) and 
data from interviews of a group of male speakers — seven locally born before 1905 
and one foreign born in 1904 but arriving in Hawai‘i a year later (Bickerton 1977).

First, there was the development of a TMA system, with bin V for past, go(n) 
V for future/irrealis, and stay V or V-ing for progressive: 


(13) This fella bin see. 
‘This person saw (it/him/her).’ (1909; Roberts 2005: 180) 


(14) lawya gon teik, e. 
‘Lawyers are going to take (money), aren’t they?’ (Bickerton 1977: 113)


(15) a. Wan taim wen we go hom in da nait dis ting stei flai ap. 
‘Once when we went home at night these things were flying about.’ 
(Bickerton 1977: 18) 
b. maeshin shap hi teiking, si? 
‘He’s apprenticed in a machine shop, see?’ (Bickerton 1977: 101)


Second, there was development of a copula stay in locatives, an existential (and 
possessive) marker get, and a complementizer for: 


(16) That time Sing Ping no stay, about 12 o'clock Sing Ping come home. 
‘Sing Ping wasn’t here at that time; he came home about 12 o’clock.’ 
(1904; Roberts 1998: 23) 


(17) I believe get all black paint. 
‘I believe there was just black paint.’ (1923; Roberts 2005: 176) 


(18) You speak you want one good Japanese man for make cook. 
‘You said you wanted a good Japanese man to cook.’ 
(1905; Roberts 1998: 29)

Evidence exists that the development of at least some of these expanded features
of HPE was influenced by reinforcement or transfer from the substrate languages 
that were dominant at the time that the language was expanding. Of the immi- 
grant languages, these were Portuguese and Cantonese. Although the number of 
speakers of Japanese was greater in the total population, in the all-important G2 
that first shifted to HPE, the numbers of Portuguese and Cantonese speakers 
were more significant. It is also clear that the Portuguese were the first group to 
abandon their ancestral language, followed by the Chinese (see Roberts 1998; 
Siegel 2000). One of the substrate-influenced features appears to be the use of a 
single form get to express both existential and possessive. This occurs both in 
Portuguese (the verb ter or haver) and in Cantonese (yáuh). And the use of stay
for both the locative copula and verbal auxiliary in progressives is parallel to that 
of Portuguese estar. (For more details and other features, see Siegel 2000; 2008.) 

From approximately 1905 to the 1920s, the number of children of the G3 
increased rapidly. These were the children of the first locally born generation 
(G2) who had shifted to the pidgin as their primary language. Thus the children 
of the G3 heard the expanded HPE from their parents, and later from their peers, 
and acquired it as their first language — and in most cases, only language. Thus, 
Hawai‘i Creole emerged with the G3, the children of the locally born children 
of the original immigrants. Of course the G3’s creole was not exactly the same
as their parents’ HPE, as the children regularized the still variable input, and adopted 
some features but not others. 


Other English-lexified creoles 

Two other English-lexified creoles developed in the Pacific under very different 
circumstances. The first was on Pitcairn Island in the central South Pacific. In 1792, 
after the famous mutiny on the Bounty, 9 mutineers (5 from England, 2 from 
Scotland, 1 from the USA, and 1 from St. Kitts in the Caribbean) settled on the 
uninhabited island with 6 men and 12 women from Tahiti (some were originally 
from Tubuai). The Tahitians were treated almost as slaves, and in 1794 the men 
revolted. The violence resulted in their deaths and left alive only four of the 
original mutineers. After a few years of peace, more violence erupted and that 
along with illness led to further deaths. In 1800, there was only one surviving 
man, John Adams, left on the island, with 10 women and 23 children. When Adams 
died in 1829, the population was 80. 

It is likely that the first children born on the island were bilingual in the mutin- 
eers’ language, English, and their mother’s language, Tahitian. But a pre-pidgin 
also developed on Pitcairn that had a mixture of features of the English dialects 
of the mutineers, the Tahitian of the women, South Seas Jargon used by visiting 
whalers, and perhaps an Atlantic contact variety, spoken by the mutineer from 
the Caribbean. It is possible that the first island-born generation stabilized this 
variety and adopted it as their primary language, at which time it expanded. When 
it became the first language of the second generation of children born on the island, 
it could then be classified as a creole. But throughout its history, this creole has 
differed from others in that a large proportion of its speakers have been literate
in English, and exposed to the Bible and religious texts (Ingram & Mühlhäusler
2004: 785). 

In 1856, all of the 194 people living on Pitcairn moved to Norfolk Island, a larger 
island near Australia that had been a penal colony. A few families returned to 
Pitcairn in 1859 and others in 1864. The splitting of the community resulted in 
two different varieties of the language: Pitcairn (now sometimes spelled Pitkern), 
and Norfolk (sometimes spelled Norf’k). The precise differences between the 
modern varieties are not well understood, despite fairly recent studies of both 
(Källgård 1993 on Pitcairn; Ingram & Mühlhäusler 2004, and Mühlhäusler 2004
on Norfolk). Nevertheless, since Pitcairn Island remains a fairly isolated colony 
of Great Britain, while Norfolk Island has been a territory of Australia since 1901 
(partially self-governing since 1979), it is thought that Norfolk is closer to English 
because its speakers have more frequent contact with tourists and other outsiders. 
Today, speakers of Norfolk all know and use standard English, and reserve Norfolk 
for informal in-group communication. However, both Pitcairn and Norfolk are 
not as commonly spoken as before and the language as a whole is considered 
endangered (Mühlhäusler 2004: 799).

Very little is known about the other English-lexified creole, but its origins appear 
to have been similar to those of Pitcairn/Norfolk. It developed in the Bonin 
(Ogasawara) Islands, which were uninhabited until 1830, after 20 settlers arrived: 
5 Europeans (3 English speakers, 1 Genoese, and 1 Danish) and 15 Pacific 
Islanders (10 men and 5 women), mostly from Hawai‘i but also from Tahiti and
the Marquesas (Long 1999; 2007). Later settlers spoke dozens of languages, from 
Europe, the Pacific, and Asia. Reports indicate, however, that the lingua franca 
was English, although a list of words collected there by shipwrecked Japanese 
sailors in 1840 contained items of both Hawaiian and English origins. Because 
of many visiting ships after the initial settlement, it is assumed that the settlers 
were also exposed to the South Seas Jargon that was used all over the Pacific at 
that time. Thus, it is thought that an English-lexified contact variety developed 
among the first generation of settlers, influenced by Hawaiian (perhaps Jargon 
Hawaiian, see above) and South Seas Jargon. This became the first language of 
either the first or second generation born on the island, and thus a creole. After 
Japan colonized the islands in the 1870s, the original settlers continued to speak 
“English” along with Japanese, and they maintained their language into the 
twentieth century. However, with increased travel away from the islands, more 
education in English, and the presence of the American military after World War 
II, the language became closer and closer to mainstream English. 


2.2 Pidgins and creoles lexified by other 
European languages 


Although German-lexified contact languages are rare in the world, one did emerge 
in German New Guinea: Rabaul Creole German, also known as Unserdeutsch 
(Volker 1991). This developed at a boarding school for mixed race children estab- 
lished by Catholic missionaries in 1897 at Vunapope, near the capital, Rabaul. These 
were the children of local women who had relations with men from other parts
of the world, including Germany, Australia, Micronesia, China, Ambon, and the 
Philippines. The children came to the school at a young age, knowing a bit of 
their mother’s language and an early form of Tok Pisin. But at school they were 
taught German and that was the only language they were allowed to speak. The 
German-lexified contact language seems to have resulted from the students 
using German words in Tok Pisin sentences. This relexified Tok Pisin stabilized 
quickly in the relatively isolated dormitories, and remained as an in-group lan- 
guage, even though the students eventually progressed in standard German. When 
the students left school and became adults, many of them married other former 
students and settled around the Vunapope mission. They continued to use their 
own form of German, which they called Unserdeutsch, among themselves, and spoke 
a local colloquial form, which they called Normaldeutsch, with other Germans. This 
continued even after Australia took over in 1914. However, with the coming of 
World War II, the situation began to change; teaching of German was prohibited 
and the community became more mobile. At the time of independence, most of 
the community moved to Australia, and more than a thousand of their descen- 
dants live in southeastern Queensland. 

In several ways, Unserdeutsch is not a typical creole. First, it did not arise 
from a pidgin or pre-pidgin needed as a medium of wider communication 
among speakers of different languages, since the students already had such a 
medium in Tok Pisin. Second, its speakers were bilingual in a more standard form 
of German. But it is typical of a creole in that it filled the need for a distinctive 
in-group language for a newly emerged mixed community. 

France has played a much greater role in the Pacific than Germany, yet French- 
lexified contact languages are also rare. A French-lexified pidgin is said to have 
been used in plantations in the New Caledonia region in the latter half of the nine- 
teenth century after the French took over the islands (Corne & Hollyman 1996), 
but little is known about it. 

More is known about the French-lexified creole of the French territory of New 
Caledonia, Tayo, also known in French as Patois de St-Louis or just Patois. It is 
currently spoken primarily in the village of St-Louis, about 15 kilometers from 
the capital, Nouméa. It has about 2,000 speakers, half of these the permanent inhab- 
itants of St-Louis and the other half former residents who live elsewhere around the 
territory. Information on the origins of Tayo comes from Ehrhart and Corne (1996). 

St-Louis was established at an uninhabited site by French Marist missionaries 
in 1860. It was to be a village for new converts and a training center for catechists. 
From 1860 to 1880, speakers of as many as 20 different Melanesian languages were 
attracted to St-Louis, but three mutually unintelligible languages were the most 
common: Cèmuhî, Drubea, and Xârâcùù.

The Melanesian settlers were exposed to French in varying degrees at the 
mission, where they went to school and church and worked in the sawmill, rice 
paddies, sugarcane fields and vegetable gardens. During the first 20 years of the 
settlement, a French-lexified pidgin began to develop as the lingua franca. This 
pidgin became more and more important as the medium of communication at
St-Louis, especially among the first locally born generation. According to oral
tradition, this generation was bilingual in the pidgin and the Melanesian language 
of their group, but the next generation, especially those born after around 1920, 
acquired the pidgin as their first language and had only passive knowledge of 
their parents’ and grandparents’ first languages. This is when the creole now known 
as Tayo became established in the community. (Note that this matches the
three-generation shift scenario that occurred in Hawai‘i.) Today there are very few
speakers of Melanesian languages left in St-Louis. 

As in Melanesian Pidgin and Hawai‘i Creole, the expanded features of Tayo
demonstrate influence from the substrate languages. With regard to the verb phrase, 
Tayo has preverbal TMA markers derived from French forms but corresponding 
to markers that occur in all three of the major substrate languages or at least in 
the two most dominant ones when expansion was taking place: Cèmuhî and Drubea.
These include a future marker va (from va, the most common form of the French
verb aller ‘to go’), a progressive aspect marker antrande (from en train de ‘in the
process of’), and a past accomplished or completive marker fini (the past participle
of finir ‘finish’). The following examples come from Ehrhart (1993):


(19) a. bon   la   va   rantre     se   swar ... (p. 220)
        good  3SG  FUT  come.home  DEM  evening
        ‘OK, she will come home this evening ...’

     b. nu   antrande  mwanche  chokola (p. 118)
        1PL  PROG      eat      chocolate
        ‘We are eating chocolate.’

     c. pi   kan   sola  fini  labure  later  sola  plante  mais (p. 246)
        and  when  3PL   ACP   plough  earth  3PL   plant   maize
        ‘And when they had finished ploughing the earth, they planted the maize.’


All three of the main substrate languages also have a preverbal marker of evidential
modality, asserting the reality of the event or state being reported. Tayo
appears to have a similar marker: ryanke (from French rien que ‘nothing but’) and
its variant anke, for example:


(20) sola  anke  fe    an  gran  barach     si  larut
     3PL   EVID  make  a   big   barricade  on  road
     ‘They’ve (certainly) made a big barricade on the road.’


This contradicts the claim (McWhorter 2001) that evidential marking is not found 
in creoles. 


3. New Dialects 


Two types of new dialects may emerge in contact situations: mixed dialects (or 
koines), which result from dialect contact; and indigenized varieties, which result 
from language contact. Examples of each are found in the Pacific.


3.1 Koines


A koine is a new dialect that results from a changed pattern of contact between 
dialects of the same language. For example, an “immigrant koine” may develop 
when speakers of different regional dialects relocate to a new location and 
together form a new community (Siegel 1985). The process of koine formation is 
called koineization, and it often leads to intermediate or “interdialect” forms, not 
found in any of the contributing dialects (Trudgill 1986). In the first stage of koine 
formation, the pre-koine, the various dialects in contact are used concurrently for 
communication. A stable koine emerges much as a stable pidgin does — i.e. when 
leveling occurs, and some variants are eliminated while others are retained. 
Often the resulting koine is formally simpler than any of the contributing dialects 
— for example, in having fewer marked grammatical categories. 

Fiji Hindi (sometimes called Fiji Hindustani or Fiji Bat), the informal language 
of most Indo-Fijians, is an immigrant koine composed mainly of features of
several of the regional dialects of Hindi. These were spoken by the more than 
45,000 indentured laborers who were brought to Fiji from North India between 
1879 and 1916. Most of the lexicon and grammatical morphology of Fiji Hindi 
come from Eastern Hindi dialects such as Awadhi, and from Bihari dialects, mainly 
Bhojpuri. Other grammatical features come from the Pidgin Hindustani spoken 
on the sugar plantations, as mentioned above. Mixing is evident in the pronoun 
system, with forms from all three sources just mentioned. The pronoun system 
shows simplification as well, with the loss of the intimate second person pronouns, 
leaving only familiar and formal categories. Also, the system is more semantically
transparent, with log ‘people’ joined to the singular pronoun to form the
plural for animates, e.g. ham 1SG, hamlog 1PL. An interdialect form is the
second person possessive pronoun tumar (tohar, tuhar, or tum(h)ara in the contributing 
dialects). With regard to the lexicon, many items also come from Fijian, especially 
for names of local flora and fauna (such as dalo ‘taro’ and walu ‘kingfish’), and 
from English (such as room, towel, book, and reef). However, for some, semantic 
shift has occurred — for example, gate means ‘field or paddock’. 

Another immigrant koine began to emerge in Fiji in the early twentieth century
among former plantation laborers from the Solomon Islands who stayed on
in Fiji at the end of their contracts. This was known as Wai (Siegel 1987: 211-33). 
It was a mixture of dialects of North Malaitan. Although five separate languages 
are often distinguished today — Lau, To’aba’ita, Baelelea, Baegu, and Fataleka — 
they are mutually intelligible, and generally considered to be “major dialects 
or sublanguages” of a single language (Tryon & Hackman 1983: 27). Again, the 
pronoun system is illustrative, with a mixture of forms from different dialects and 
the loss of pronouns distinguishing dual and trial first person exclusive from the 
plural. An example of an intermediate form is rayguva ‘grease’ (ragufa, raraya, or 
ndila in the contributing dialects). It appears, however, that Wai did not stabilize 
before it disappeared when the Solomon Islanders shifted to the Fijian language 
as they became integrated into the Fijian community. 

A koine based on dialects of Japanese began to emerge in the Palauan islands
in the western Pacific (described by Matsumoto & Britain 2003). Japan occupied 
the islands (now the Republic of Palau, or Belau) from 1914 to 1945, and there 
was a mass migration of Japanese workers. In 1941, there was a total of 23,980 
Japanese, compared to only 6,514 Palauans (Matsumoto & Britain 2003: 43). The 
workers came from 11 districts of Japan, representing all four major dialect 
regions: Eastern Honshū, Western Honshū, Kyūshū, and Ryūkyū (Okinawan).
It appears that Palauan Japanese was a mixture of features of dialects from these
regions. The main contributors, however, were the Eastern dialects (from the Kantō
[Tokyo], Tohoku, Hokkaido, and Tokaido districts), whose speakers were originally
dominant in number. Because of a great deal of interaction with the
Japanese, including intermarriage, Palauans also learned the language, and their
second-language versions of Japanese also contributed to the mix. It appears that 
variants from Okinawan and other nonprestigious dialects were leveled out. But 
it is not clear if Palauan Japanese got past the pre-koine stage because surviving 
speakers use variants from both Eastern and Western dialects for the same 
function. 

Another example of mixing of Japanese dialects occurred in Hawai‘i among 
the more than 200,000 Japanese imported as plantation laborers from 1884 to 
1924 (described by Hiramoto 2006). Again, immigrants came from all four major 
dialect regions, but in contrast to Palau, the most dominant dialects were from 
the Western region, especially the Chūgoku dialect. Also in contrast to Palau, other
ethnic groups did not learn Japanese. Recordings of the first generation of immigrants
reveal them using some lexical and morphosyntactic forms from dialects
other than their own. Some speakers also used an interdialect form daké for a 
conjunction (Eastern dakara and Western jaké), and loanwords from English, 
including the pronoun me. However, it appears that a stable koine did not 
emerge in the second generation before they shifted to Hawai‘i Creole as their 
primary language. 


3.2 Indigenized varieties 


Indigenized varieties are new dialects that have arisen in colonies where the colo- 
nial language has had widespread use in the education system, and has been learned 
as a second language by a large proportion of the population. Like an expanded 
pidgin, an indigenized variety is used in a multilingual environment and functions 
as a lingua franca for daily interactions among speakers of different languages. 
Unlike an expanded pidgin, however, its grammatical rules are much closer to 
those of the lexifier (the colonial language), although some of the lexicon, phono- 
logy, and morphosyntax are influenced by the indigenous substrate languages 
(thus, indigenized). Nearly all research on indigenized varieties has been done 
in countries around the world where English was the colonial language until 
independence in the latter half of the twentieth century — for example, Singapore, 
India, and the Philippines. Thus, the terms New Englishes, World Englishes, and
Postcolonial Englishes are often used as well. 

Two indigenized varieties are found in the Pacific: Fiji English (Siegel 1989; Mugler
& Tent 2004; Tent & Mugler 2004), and Papua New Guinea (PNG) English (Smith
1978; 1988). Fiji English has many lexical items from both Fijian (e.g., sulu ‘sarong’
and kasou ‘drunk’) and Fiji Hindi (e.g., roti ‘Indian flat bread’ and choro ‘steal’). It 
also has a pronoun system influenced by Fijian, with gang as a plural marker (e.g. 
you gang), and us gang as first person plural exclusive versus us two as first person 
dual inclusive. Another influence of Fijian is a preverbal intensifier full as in: 


(21) The fella full sleeping over there. 
‘The guy’s sound asleep over there.’ (Siegel 2008: 125) 


PNG English also has many items from Tok Pisin (e.g. singsing ‘traditional 
singing and dancing’, wantok ‘speaker of the same language’, and kaukau ‘sweet 
potato’), and from indigenous languages, most probably via Tok Pisin as well 
(e.g. bilum ‘string bag’ and buai ‘betelnut’). 

Both varieties have items from English with shifted meaning — e.g. Fiji English 
grog ‘kava’ and PNG English rascal ‘thug, criminal’. And they have many 
instances of grammatical shift — e.g. Fiji English broom the floor ‘sweep the floor’ 
and PNG English be aftering someone ‘be following someone’. 

Even though there has been hardly any contact between Fiji English and PNG 
English, they also share many morphosyntactic features, and these are found 
in other indigenized varieties as well. These features include regularization of 
plurals to include non-count nouns (e.g. furnitures), phrasal verbs where simple 
verbs exist in standard varieties (e.g. cope up), no change of word order for ques- 
tions, invariant question tags, variable copula/auxiliary use, and one used as an 
indefinite article. Some of these are illustrated in examples from Fiji English (Mugler 
& Tent 2004: 775): 


(22) a. Jone and them coming to the party tonight, éh? 
b. They should have one security guard up here at night sitting in one shed. 


Conclusion 


This chapter has shown the diverse origins and linguistic features of Pacific con- 
tact varieties. However, some commonality in their development can also be seen 
in various shared characteristics. These are consequences of the overgeneralization
and lack of complexity that are found in early second language acquisition.


NOTES 


1 Drechsel (1999; 2007) also provides some evidence of a maritime Polynesian pre-pidgin. 

2 Standard Fijian orthography is used in examples of both standard and pidginized Fijian: 
<b> = /ᵐb/, <c> = /ð/, <d> = /ⁿd/, <g> = /ŋ/, <q> = /ᵑg/. Capital letters are used for
non-prenasalized stops not found in standard Fijian: <D> = [d], <G> = [g]. 

Other grammatical abbreviations used in the examples in this chapter are:


ACP accomplished EVID evidential 
ART article EXCL exclusive 
ASP aspect FUT future 
CAUS causative LOC locative 
DEM demonstrative NEG negative 
DIR directional 


According to Keesing (1988), a single, distinctive Pacific Nautical Pidgin developed in
the central Pacific before the 1860s and this was the forerunner of Melanesian Pidgin.
However, historical research does not back up this claim (see Baker & Mühlhäusler 1996).

Bickerton calls these “early creole speakers” (1977: 333) but clearly states that they were
“nonmonolingual” as opposed to the “monolingual” creole speakers who were born
later. Of course, it must be kept in mind that these early speakers were not recorded
until the 1970s and could have been influenced by later developments in the language.

Note that this description of the origins of Hawai‘i Creole is based on the recent findings
of Roberts (2005) and differs from that presented by Bickerton (e.g. 1981); see Siegel
(2008).

The orthography used here is an adapted version of the lortograf-linite system created 
by Baker and Hookoomsing (1987). The major differences between this system and IPA 
are that tch = [c] and ch = [ʃ]; also, either an or am = [ã] and on or om = [õ].

This example was collected by Chris Corne before his death in 1999. (See Siegel 2008: 
223.) 

Of course, it is often difficult to distinguish whether two varieties are separate languages, 
or dialects of the same language. While there are some clear cases of language versus 
dialect, there is no precise linguistic dividing line that can distinguish them. Similarly, 
the other types of contact varieties described here are graded phenomena rather than 
essential categories. For example, there are pidgins, such as Hiri (Police) Motu, that fall 
somewhere between the restricted and expanded categories. And as we have seen with 
Melanesian Pidgin, it is sometimes hard to draw the line between an expanded pidgin 
and a creole. 

pidgins in Europe 423-4
pidgins/creoles outside Europe 424-5 
stress in 395, 396 
substratal influences 406-8 
superposition 422 
vigesimality in 383-4 
volume of borrowing 133 
Gibraltar, Spanish in 552, 558
glottalization 211 
GOAT-fronting in Milton Keynes 244-5 
grammatical features, stability 141 
grammatical replication as linguistic 
process 100 
grammatical transfer, categorial 
equivalence 12 
grammaticalization 
areas 97-8 
Chinese 761 
constraints on 99-100 
vs. contact 87 
contact-induced 88 
and convergence 68-9 
forces in 94-7
ordinary 89-90 
previous studies 4 
of South Asian languages 745-53 
and unidirectionality 69, 75, 99 
universals in 99 
use patterns in 89 
Great Famine in Ireland (1845-8) 153 
Greek 
convergence 73-4 
dialects and the Koine 58 
Fertek 75-6 
loanwords in Arabic 638 
and Turkish 181, 190 
Greek Cypriot and English, code- 
switching 195, 199-201 
Guarani, contact with Spanish 553 
Guayaramerin, Bolivia 569-70, 572 
Guinea French 708-9 
Guiné-Bissau, creole in 257 
Gurindji 782 
code-switching in 783 
Kriol 785-7 


habitual forms, sub-Saharan English
530-1
Haida, parallels with Tlingit 678-9
Hakka 759, 764
Hakohol (Hassidic newsletter),
code-switching in 189
Halbdeutsch 423
Hanseatic League 422
Hawai‘i Pidgin English (HPE) 824-6
Hawaiian Creole English (HCE) 764


Hawaiian Maritime Pidgin 815, 818 
heritage language groups 344 
hierarchy of constraints on structural 
borrowing 184 
hihi sublist 135, 136-7 
Hindi and Punjabi, code-switching 191 
Hindustani, use in Fiji 817, 830 
Hiri Motu 818 
historical change 10-11 
historical distinction between borrowing 
and inheritance 130-1 
historical explanations for linguistic 
change 35 
historical linguistics 266 
vs. sociolinguistics 46 
history of languages, previous studies 4 
Hokan macrofamily 370-1 
Hokkien, reduplication 511 
Hong Kong University, MIX variety 194 
Høyanger, as a new town 240-2
humour and code-switching 200 
Hungarian 
contact with Finno-Ugric 600, 601-2 
speakers in Burgenland, Austria 
610-11 
hybrid approach 52 
and borrowing 56 
to genetic classification 50-1 
hybrid forms 
of dialects in the Fens 223 
resulting from innovations 211 
hybridization of Australian languages 
785-7 


Iatmul, borrowing in 799 
identity and code-switching 193-4 
ideology, as borrowing constraint 178 
Île de Groix, France 37
imitation 173
see also borrowing
Immigrais (variety of Portuguese) 193
immigrant koines 231
imperfect learning, as predictor of
contact-induced change 36
implicational hierarchies of borrowing 79
importation 173
see also borrowing
imposition 20, 456
definition 19, 171


indefinite article, in contact-induced 
grammaticalization 92-4 
Indian English in South Asia 742-3 
Indian South African English (ISAE) 295 
indirect morphosyntactic diffusion in 
Arnhem Land 775 
Indo-Aryan languages 
contact 738-9 
grammaticalization 746-53 
previous studies 740 
Indo-European 
history 380-1 
number construction 381-8 
relationship to Finno-Ugric 605-6 
stress in 395-7 
two copulas in 389-95 
Indo-Uralic family 366 
inflectional morphology, borrowing 176-7 
inflectional systems, hard to borrow 41 
innovations 173 
attitudes to 38 
in the Fens 224 
interlocutors’ reactions 72, 217-18 
social evaluations 211 
speech errors as 33 
without linguistic change 88 
innovative transfer, definition 19 
Insular Celtic languages see Celtic 
languages 
integration, degree of, as predictor of 
linguistic change 44-5 
intensity of contact, as predictor of 
linguistic change 36-8 
interaction, essential in language contact 
31-2 
interdialect forms in the Fens 223 
interdialectal interference, and minimal 
typological distance 40 
interference, definition 170 
interjections, Balkan languages 626-7 
internal change 32 
vs. contact, naturalness 87 
as default explanation 34 
vs. external change 7, 35 
learnability as explanation 34 
regularization 17-18 
and universal markedness 44 
intertwined languages 183-5 
see also mixed languages 




intra-individual variability 235 
intrasentential switching vs. single word 
switching 195 

Iranian
contact with Slavic languages 582
contact with Turkic languages 657
Ireland
Anglo-Normans in 10-11
language death in 321
language shift in 152-67
Irish
and Anglo-Norman, stress shift 15-16
and English 11-14
as null subject language 111, 112-13
Irish English, two copulas in 394-5
irreversible solidarity, hypothesis of 332
Isicamtho (South African variety) 707-8
Isomorphy Principle 312
Italian
koineization 231-2
and Molisean 90-1, 93
passive markers 97-8
and Sardinian, code-switching 191-2
Italo-Spanish contact language 562-3


Jalonke, contact with Fula 702 
Japanese 
and English, adaptation of loanwords 
173, 174 
in Hawai‘i 825-6 
influence from Chinese 765 
koine in Palau 831 
loanwords in American English 459 
reallocation of dialects 215 
wh-movement 110 
Jargon Fijian 815-16 
Jargon Hawaiian 815, 818 
jargons, Pacific 815-16 
Jerriais (Jersey French), code-switching in 
197 
Jewish English 462-3 
joke-telling and code-switching 200 
Journal of Language Contact 698 
Juba Arabic 637, 646 
Judeo-Sorbian and Yiddish 56 


Kaaps (South African Dutch) 706 
Kalahari Basin 364 
Kannada 744-5 
Karachay-Balkar 657-8
Karawa 796 
Karawari 
metatypy in 801-2 
and trade pidgins 809 
Karelian speakers 611 
Karuk 683 
Kaurna 778 
Kawaiisu 688-9
Khalaj 657 
Khoekhoegowab (Namibian language) 
343-4 
Khuzistani Arabic and Persian, 
convergence 74-5 
Kim (Atlantic language) 703 
King’s Lynn 220 
Ki-Nubi 637, 646 
Kjakhta Chinese-Russian 723 
Koine, the 58 
koineization 231 
and African-American English (AAE) 
457-8 
of Arabic 636 
definition 58 
Dhuwaya 787 
and family tree model 58-9 
in new towns 240-5 
Pacific Islands 830-1 
Kolyma Yukaghir 730-1 
Kombuistaal (variety of Afrikaans) 
295-6 
Kopar and Watam, borrowing between 
799 
Korean, influence from Chinese 765 
Korean-American identity 293 
Kovai and Mangap-Mbula, borrowing 
between 798-9 
Krio 519, 705 
Kriol 334, 779-81 
Gurindji 785-7 
Kru languages 699-700 
Kru Pidgin English 705 
Kuot, metatypy in 802 
kupwarization 744-5 
Kwakw’ala 689-90 


L2 learning see second language 
acquisition 
Langobards, migration 417-18 


language acquisition 
and simplification 310-12 
switch box analogy 108 
and universal markedness 43 
see also second language acquisition 
language change 
evolutionary theory 501-2 
types 500 
language death 
Australia 784 
categories in 326-32 
causes 320-2 
previous studies 4 
as a scenario in contact-induced change 
277 
speed of 322-5 
language drift, and partial restructuring 
258-9 
language grouping, perfect phylogeny 
approach 132-3 
language mixing 192 
language planning 38 
language shift 
and code-switching 191-2 
in Ireland 152-67 
unidirectionality 192 
late system morphemes and 
code-switching 184 
Latin 
contact with early Germanic 415-16 
influence on English syntax 439-40 
influences on Old English 435, 438 
influence on Polish 590 
stress in 395, 396 
Latina magazine, code-switching in 189 
Latvian, contact with Livonian 604 
learnability, as explanation for internal 
change 34 
left dislocation in sub-Saharan English 
529-30 
lenition, Welsh influence on English 446 
leveling 214-15 
of dialects in the Fens 223, 225 
and partial restructuring 258-9 
as a scenario in contact-induced change 
274 
Levenshtein distances 138 
lexical attrition 326-8 
lexical borrowings 172-5 


lexical gaps, filling 15 
lexical substitution 39 
Lexicon-Grammar mixed languages 183-5 
lexicostatistics 130 
lexifier languages and creoles 60 
Liberian English 705 
Lingua Franca 253-4, 255 
linguistic areas 16-17 
previous studies 3 
linguistic dominance and borrowing vs. 
imposition 171 
Linguistic Geography of Africa, A (Heine 
and Nurse) 697 
linguistic predictors of linguistic change 
39-45 
linguistic transfer, types 86 
literature, code-switching in 189 
Lithuania 593 
Lithuanian and Finnic languages 45 
Livonian, contact with Latvian 604 
loan meanings 172-3 
see also borrowing 
loanwords 172 
adaptation 173-4 
in American English 459-60 
in American languages 674 
Arabic 637-9, 643-8 
Australian in English 782-3 
Chinese in English 766 
and code-switching 195, 196 
Germanic in Slavic languages 583 
Iranian in Slavic languages 582 
in Turkic languages 660-6, 667-8 
see also borrowing 
local identity vs. expanded identity 294-5 
lolo sublist 135, 136-7 
London creole 193, 198 
London dialects, supralocalization 215 
London English, Diphthong Shift in 246 
loyalty, as borrowing constraint 178 
Lumbee Indians 294 
Lushootseed (Salishan language) 175 


Ma’a (Tanzania) 
and Bantu 42 
structural borrowing 180 
Macanese 764 
Macassan 778 
Macedonian, convergence 73-4 
macro-acquisition of English 520
macrofamilies
in Africa 363-4
in Australia 367-8
definition 361
in Eurasia 364-6
in New Guinea 366-7
in North America 369-71
in South America 371-2
Macro-Sudan 363-4 
Madagascar, Arabic in 644 
Madak, metatypy in 802-3 
Maisin language 797-8 
Malay 
contact with English 504 
comparison with Singlish 506-12 
Malay creole and Tamil 98-9 
Malta, Arabic in 639 
Maltese, as mixed language 183 
Manam 
metatypy in 805, 807 
and Watam, metatypy between 800-1 
Mandarin
vs. Cantonese 757
contact with other Chinese languages
762-3
Mande group of languages 699-700, 703
expansion 702 
Mangap-Mbula and Kovai, borrowing 
between 798-9 
Maningrida settlement, code-switching in 
783 
Maori English, dialect shifting in 291-2 
Marathi 744-5 
Mari languages 599 
Martha’s Vineyard 289-90 
Martinique, early creole in 255 
Masai (Nilotic language) 134 
matter replication see borrowing 
Mayan languages, contact with Spanish 
554 
Maybrat, borrowing in 799 
Mbugu (Bantu language) 180 
meaning frequency and rate of change 
135 
Media Lengua 3, 183, 274 
Medium for Interethnic Communication 
(MIC) 705 
Mednyj Aleut, as mixed language 184 
Megleno-Rumanian and Bulgarian,
minimal typological distance 40 
Mekeo Trading Language 819 
Melanesian language on St-Louis 828-9 
Melanesian Pidgin (MP) 197, 819-22, 
823-4 
Memphis, Tennessee 290 
Mersea Island, Essex 215 
metatypy 
definition 19 
between New Guinea languages 800-8 
as a scenario in contact-induced change 
275-6 
Michif (mixed language) 61, 184, 786 
Middle English 
influences on 436-8 
relative clauses 445 
migration and new dialect formation 246-8 
Mike 287-8 
Milton Keynes, new dialect formation in 
242-5 
Min dialects and Mandarin 762, 764 
minimal convergence and dialects 285-7 
minimal hypothesis 116 
minimal typological distance see 
typological distance 
MIX (Hong Kong variety) 194 
mixed languages 183-5, 333 
and genetic classification 61 
previous studies 3-4 
mixing-bowl metaphor 236, 239 
mobility 212-15 
and accommodation 210 
changing 213 
in dialectology 208 
around the Fens 224 
models from other disciplines 144 
moderate convergence and dialects 287 
Modern Hebrew and Yiddish 56, 57 
Modern Tiwi 785 
Moi, metatypy in 806 
Molisean 
comparative constructions 96 
and Italian 90-1, 93 
Mongolic, loanwords in Turkic 663-4 
Montana Salish 
absence of English and French 
loanwords 38 
and English 42-3 


Mordvin languages 598 
Moroccan Arabic and French 195 
morpheme transparency and borrowing 
179 
morphological agreement system 116 
morphological borrowing 176-7 
morphological structure in American 
languages 684-91 
morphophonology and code-switching 
196 
morphosyntactic borrowing 140-1 
morphosyntactic change 119-20 
mot juste switching 196 
“motherese,” koineization of 787 
motivations for transfer 14-15 
Motu, pidgin 815, 818 
multi-ethnic settings 293-6 
multilingualism 
in Balkan languages 625 
in New Guinea 796-8 
symmetry 343 
music, code-switching in 193-4 
mutual intelligibility and speciation 61-2 


Naga pidgin 743-4
Nahuatl, contact with Spanish 554
Nandi (Nilotic language) 134
Native American shift to English 461-2
nativization of pidgins 256
naturalness of contact-induced change 87
Nauru Pidgin English 822-3
Navajo 676
shift to English 322
Ndjuka 255-6
necessity in borrowing 177
negative politeness (NP), and
code-switching 200
neglect of grammatical distinctions 12-13,
161-2
Neo-Aramaic 641
Netherlands see Dutch
network diagram of English accents
139-40
networks
in computational linguistics 129
vs. trees 133-4
new dialect formation (NDF) 231
and migration 246-8
in new towns 240-5
social factors 237-8 
stages 234-6 
New Guinea 
geography 795 
German creoles in 827-8 
macrofamilies 366-7 
multilingualism in 796-8 
New Guinea languages 
borrowing in 798-800 
metatypy between 800-8 
pidgins in 808-11 
new towns 240-5 
new varieties 230, 232 
New World, pidgin formation in 256 
New York City 
dialects 282 
minority ethnic groups 292-3 
New Zealand, dominance and dialect 
shift 291-2 
New Zealand English 
homogenization 238-9 
new dialect formation, social factors 
237-8 
new variety formation 234-6 
Nez Perce, absence of English and 
French loanwords 38 
Ngandi 775-6 
and Ritharngu, structural borrowing 
179 
Ngarinyman 776 
Ngatikese Pidgin 823
Nguni speakers 178
Niger-Congo phylum 519
Nilotic languages, borrowing within 134
Nisgha and English 43
Nobiin and Dongolawi (Nubian
languages) 56-7
nonbinary categories 14 
nonsystemic vs. systemic elements 11 
Norfolk Island 826-7 
Norman French and English 37, 437 
Norse, influence on Germanic 419-22 
North America, macrofamilies 369-71 
North Carolina, multi-ethnicity 294-5 
Northern Cities Shift 291 
Norwegian, contact-induced simplification 
307-8 
Nostratic 364
null subject 109
in Early Modern English (EME) 114 
in Early Modern Irish English (EMIE) 
113-17 
in Irish 111, 112-13 
numbers, construction in different 
languages 381-8 
Nunggubuyu-Ngandi language family 772 
Nuuchahnulth 680-1, 690-1 
Nyamal 778 


obligatory multilingualism 303-4 
obsolescence, previous studies 4 
Okinawan, loss of autonomy 59 
Old Church Slavonic (OCS) 585 
Old English
influences on 432-8
relative clauses 444, 448-9
two copulas 389-91
onomatopoeia in Balkan languages 627 
ordinary grammaticalization 89-90 
Ossetic, borrowing 41-2 
outsider complexity 311-12 


Pacific Islands, koineization on 830-1 
Pacific languages
creoles 823-9
pidgins 778, 814-23
Pacific Northwest Sprachbund
absence of English and French
loanwords 38
internal vs. external explanations 35
Palauan Japanese 831
Pama-Nyungan stock 368, 771-2, 773-7
Papua New Guinea (PNG) English 832
Papuan languages
contact between 304
metatypy in 800-8
multilingualism in 796-8
Papuan Pidgin English 822
parallels, contact vs. coincidence 160-1
parameter approaches 107-11
parthenogenesis analogy 53-4
partially restructured vernaculars 258-9
passive markers and grammaticalization
areas 97-8

pattern replication see convergence 
Penutian macrofamily 369-70 
Peranakans 504 
perfect phylogeny approach 132-3
Permic languages 599 
Persia, minority languages in 642 
Persian 
and Arabic 634, 638 
and Khuzistani Arabic, convergence 
74-5 
loanwords in Turkic 661-2, 667 
Peterborough 220 
Philadelphia, ethnicity in 286 
Philippines, Spanish in 552 
phonetic comparison algorithms 138 
phonetic distinctions in borrowing 175-6 
phonetics 
Afro-Spanish 563-5 
network diagram of English accents 
139-40 
phonological characters in basic meaning 
lists 137-8 
phonological metatypy 802-3 
phonological units, transfer from first to 
second language 11 
phonology 
of American languages 675 
in Arabic borrowing 645 
and attrition 330-1 
Irish vs. English 162 
Spanish in South America 561-3 
Turkic languages 668 
pidginization 
and code-switching 197 
and simplification 310 
pidgins 
African languages 704-6 
Arabic 637 
Australian languages 777-9 
Chinese 763-4 
early 253 
of Finno-Ugric languages 604 
formation 254 
Hawaiian 818, 824 
in New Guinea languages 808-11 
in Pacific languages 814-23 
Plantation Pidgin Fijian (PPF) 816-17 
in Siberia 722-5 
South Asian 741-5 
see also creoles 
Pijin 822 
Pipil, contact with Spanish 554 


Pitcairn Island 826-7 
Pitjantjatjara dialect 787-8 
kinship system 334 
pivot matching 71-2 
place names 433-4, 436 
places and regions, production of 213 
Plantation Pidgin Fijian (PPF) 816-17 
Polabian, contact with German 589 
Police Motu 818 
Polish 
in Belarus 593-4 
contact with Czech and German 
589-90 
polysemy copying 91 
populations 
definition 230 
languages as 52 
Portuguese 
code-switching with Spanish 570-3 
contact world-wide 565-6 
creolization 255 
in Guiné-Bissau 257 
pidgins in South Asia 741, 745 
and Tariana (North Arawak language) 
107 
Portuguese creole 
in Chinese 764 
and Tamil 98-9 
positive politeness (PP), and 
code-switching 199 
possessive pronoun for inalienable 
possession 16 
predictors of linguistic change 
linguistic 39-45 
problems with 33 
social 36-9 
pre-pidgins 815-16 
Preposition Stranding 109-10 
English 444, 446 
Prince Edward Island French (PEIF) 
110, 117-19 
prerequisites for contact-induced change 
34-5 
prestige 
and code-switching 197 
and level of power 7-8 
motivation for borrowing 177-8 
and new varieties 236 
previous studies of language contact 2-7 
Primary Linguistic Data (PLD) 108-9
Prince Edward Island French (PEIF)
117-19
principle of economy 156
process of transfer 156-7
pro-drop see null subject
progressive, English 438-41
pronominal systems and attrition 329-30
pronouns, sub-Saharan English 531-2
propelling forces in grammaticalization
94-5
prosody
in Indo-European 395-7
in Irish and Irish English 158-60
Spanish in South America 560-1
Proto-Finnic-Saamic 603
Proto-Germanic 406, 414-15
punctuational bursts of language change
132
Punjabi
and English, code-switching 195, 196,
198
and Hindi, code-switching 191


Quechua
and Aymara 135-6, 137, 371-2
and Spanish 183, 553, 555-6
Quechumaran macrofamily 371-2
Queensland Canefields English 820
Quinault English 462


Rabaul Creole German 827-8
Raggasonic (band) 193-4
reactions to innovations 72
reallocation 214, 215, 235
in the Fens 224
redundancy loss 308
reduplication in Singlish 510-12
regional dialect leveling see
supralocalization
regions and changing mobility 213
regularization of grammar 17-18
relatedness see genealogical relatedness
relative clause structures, English 443-9
relexification 183
and code-switching 191
as a scenario in contact-induced change
273-4
relocation diffusion 231
replica grammaticalization 90-1, 94
replication of patterns see convergence
residual zones 303
resistance to borrowing 178
restructuring, grammatical 12
restructuring of target language to match
outset 155-6
resumptive pronouns in sub-Saharan
English 527-9
Rhaeto-Romance, passive markers in
97-8
rhoticity
in American English 457-9
and divergence of dialects 217-18
Ritharngu 772, 775-6
and Ngandi, structural borrowing 179
Romance
and Arabic, coexistence 641-2
two copulas in 393-4
vigesimality in 385-6
Romance language hypothesis 440
Romani (Balkans) 81-2
and bilingualism 80
convergence 73-4
Roper River see Kriol
Rumantsch Grischun, as a koine 58-9
Russenorsk (pidgin) 423
Russian
and Aleut 184
vs. Church Slavonic 586-7
contact with Belarusian 591, 593-4
contact with Finno-Ugric 583-5, 600,
604-5, 607
contact with Ukrainian 591-3
loanwords in Turkic 662-3, 667
settlers in Siberia 718-21
Ryukyu Kingdom 59


Saami languages 606-7
Saamic, contact with Finnic 603
Sakha (Yakut), Evenki influence on
728-9, 730
Salish and English 333
Salishan languages 175, 685
Samburu (Nilotic language) 134
Samoa, Melanesian Pidgin in 820
Samoyed languages 599
Sardinian and Italian, code-switching
191-2
Scandinavian
influence on English 436-7, 447 
simplification 306, 308-9 
scenario approach to contact-induced 
change 271-8 
Sea Islands, USA, dialect convergence in 
287 
second language acquisition 
adult 11 
critical age 310-11 
interference in 170 
neglect of distinctions 161-2 
scenario in contact-induced change 273 
and simplification 310-14 
sub-Saharan English 525-7 
see also language acquisition 
sedentarist metaphysics 208 
selection and inhibition mechanism 
80, 81 
semi-creoles 258-9 
Sepik region, pidgins in 809 
Serbo-Croatian, interdialectal interference 
40 
Settler English 705 
sexual analogy in language contact 55 
shift see language shift 
shift-induced interference 37 
and borrowing scale 41 
Siane speakers, bilingualism 796-7 
Siberia 
contact in 603 
contact with Turkic languages 655-6 
geographical spread 714-15, 717 
history 718-19 
languages in 716 
pidgins in 722-5 
Russian influence 719-21 
Sierra Leone, Krio in 705 
Silesian 590 
similarity in unrelated languages 142-3 
simplification 214 
of Bhojpuri dialect 233-4 
contact-induced 306-9, 312-14 
and second language acquisition 
310-14 
simulations, development of 144 
Singapore Colloquial English (SCE) 764 
single word switching vs. intrasentential switching 195
Singlish 499
formation 505 
grammatical structure 506-12 
Sinhala 753 
Sinitic language family 513, 757 
comparison with Singlish 506-12 
see also Chinese 
skewing to measure borrowing 134 
slavery in the Caribbean 480-1, 483, 486 
Slavic languages 
contact between 591-4 
contact in Europe 581-2 
contact with Germanic 583 
contact with Iranian 582 
grammaticalization 91-2, 94 
history 585-7 
loanwords in Turkic 663 
Slovak, contact with Czech 590-1 
Sm/algyax 690 
Smith Island, Maryland, USA, divergence 
218 
social class and dialects 283 
social dominance vs. linguistic dominance 
171 
social evaluation of innovations 211 
social factors in new dialect formation 
237-8 
social networks 38 
social predictors of linguistic change 36-9 
sociolinguistic perspectives, previous 
studies 5-6 
sociolinguistics 
and grammatical replication 100-1 
vs. historical linguistics 46 
Solomon Islands, code-switching and 
prestige of pidgin 197 
Sorbian, Upper and Lower 588-9 
South Africa 
languages 706 
multi-ethnicity in 295 
urban varieties in 707 
South African Bhojpuri (SB) 233-4 
South America, macrofamilies 371-2 
South Asian languages 
grammaticalization 745-53 
influence from Dravidian family 740 
pidgins and creoles 741-5 
South Seas Jargon 819, 824 
Southeast Asia, history of contact 502-5
Southern Vowel Shift in Memphis 290
Spanish 
in America 553-6, 559 
Argentinian 190-1, 559-63 
and Central American languages 79, 80 
code-switching with Portuguese 570-3 
and Quechua in Media Lengua 183 
in the United States 556-9 
worldwide contact 550-2 
speciation and language contact 61-2 
speech communities, populations as 230 
speech errors 
and bilingualism 80 
as innovations 33 
spelling form hypothesis, sub-Saharan 
English 525 
spontaneous replication in bilingualism 88 
see also speech errors 
Sprachbund 
and additive complexity 305, 314 
of Balkan languages 620-1, 628-9 
definition 620 
Pacific Northwest, internal vs. external 
explanations 35 
spread zones 303 
Sranan 255 
Sri Lanka Malay 743-4 
Sri Lanka Portuguese 743-4 
Srivijaya 502-3 
SSE see sub-Saharan English 
stability of linguistic properties 270 
stabilization of code-switching varieties 
193-4 
stable grammatical features 141 
Stammbaum model see family tree model 
Standard Average European 300 
stative forms, sub-Saharan English 530-1 
status, and level of power 7-8 
stigma and language death 323 
St-Louis 828-9 
stock, definition 361 
Strasbourg Oaths (842) 419 
stress in Indo-European 395-7 
stress patterns see prosody 
structural borrowing 
constraints 184 
situations for 180 
structural change and code-switching 190 
structural transfer 179 
sub-Saharan English (SSE) 519
vs. English 520-1 
spelling 525-7 
syntax 527-34 
vowels 521-5 
see also Bantu languages 
substance linguemes 177 
substitution 173 
see also borrowing 
substrate and superstrate languages 7-8 
substrate effects 57, 455-6 
in Caribbean creolization 485-6, 488 
sudden death of languages 322-4 
superposition of Germanic 422 
supportive transfer, definition 19 
supralocalization 213-14, 231 
supraregionalization 213, 231, 239 
Survey of Anglo-Welsh Dialects (SAWD) 447 
Survey of English Dialects (SED) 217 
Surzhyk (hybrid sociolect) 592-3 
Swadesh list 130, 362 
for Australian languages 774 
and borrowing 131-2 
and rate of change 135 
Swahili 
adaptation of loanwords 174 
and English, code-switching 182 
Swedish, contact-induced simplification 
306-7 
switch box analogy of language 
acquisition 108 
switch reference in Australian languages 
776 
switching see code-switching 
symmetric contact languages 277-8 
syntax 
and attrition 331-2 
of Celtic languages 543-5 
sub-Saharan English 527-34 
synthetic constructions and 
grammaticalization 96 
systemic vs. nonsystemic elements 11 


tabula rasa conditions, of new variety formation 233
Tahitian on Pitcairn Island 826
Tai languages 759-60
Taimyr Pidgin Russian 722-3
Tajik and Uzbek 655
Takia, metatypy in 800, 805, 807 
Tamil 747-8, 750-3 
and Malay/Portuguese creoles 98-9 
Tariana (North Arawak language) 
code-switching in 194 
and East Tucanoan 175-6 
and obligatory multilingualism 303-4 
and Portuguese 107 
Tayo 829 
Temne language 703 
tense in Singlish 509-10 
terminal speakers 325 
Tiwi 785 
Tlingit language of Alaska 677-9 
Tofa, lexical attrition 326-7 
Tok Pisin 254, 818, 821-2, 832 
and English, code-switching and 
pidginization 197 
Tolai, metatypy in 804-5 
tone systems in Chinese 760 
topic prominence in Singlish 508 
Torres Strait Creole 781 
trade language 342 
in New Guinea languages 809-11 
Transcaucasia, contact with Turkic 
languages 657 
transfer 
definition 18-19 
motivations 14-15 
process 156-7 
structural 179 
types 170-1 
transitivity, in South Asian languages 751 
translation as a fieldwork method 353 
transmission chains along genetic lines 50 
Trans-New Guinea (TNG) macrofamily 
366-7 
transparency of morphemes and 
borrowing 179 
Transparency Principle 312 
Trasjanka 594 
trees vs. networks 133-4 
tri-ethnic settings 293-6 
Tsimshianic languages 690 
Tsotsitaal 707-8 
Tungusic language family 715, 731 
Turkic languages 
in Anatolia 658-9 
in the Balkans 659
in the Caucasus 657-8
in Central Asia 655 
contact history 653-4 
loanwords in 660-6, 667-8 
in northwestern Europe 659 
in Siberia 655-6 
syntactic borrowing from 666, 668-9 
in Transcaucasia and Iran 657 
in the Volga-Kama region 656-7
written 660-1 
Turkish 
and Fertek Greek, convergence 75-6 
and Greek 181, 190 
Macedonian, convergence 73-4 
Twana (Salishan language) 175 
typological classification vs. genetic 
classification 49 
typological distance 
as barrier to contact-induced change 41 
as predictor of contact-induced change 
39-43 
typological similarity in unrelated 
languages 142-3 
typology, previous studies in 3 


Ugric languages 599 
Ukrainian 
contact with Belarusian 591 
contact with Russian 591-3 
Umboi Island, borrowing on 798-9 
unbound reflexive in Irish English 158 
unidirectionality 
and grammaticalization 69, 75, 99 
of shift 192 
unitary organism model of languages 52, 
54 
United States 
diffusion in 211-12 
Spanish in 556-9 
Universal Grammar (UG) 108 
universal markedness, as linguistic 
predictor of contact-induced change 
43-4 
universal mechanisms of change and 
propelling forces of 
grammaticalization 94-5 
universalist theories of creolization 484, 
488 
universality of language contact 128
universals
of borrowing 83
in grammaticalization 89-90, 99
Unserdeutsch 827-8
Upper Sorbian and German 92-3
Uralic
loanwords in Turkic 664
relationship to Yukaghir 606
urban environments, previous studies 6
urban varieties of African languages 706-9
Urdu 744-5
Uruguay
Portuguese in 566-8, 573
Spanish in 559-63, 573
Usarufa and multilingualism 796 
use patterns in grammaticalization 89 
Utian 369-70 
utterance particles in Singlish 512 
Uzbek, influence on Tajik 655 
Uzbekistan, Arabic in 648 


Vandals, migration 417 
Vanuatu
and Bislama 89-90
code-switching and prestige of pidgin 197
variables in linguistic data 348-9
Vasconic family 408-9
Vendryes’ Restriction, constraint 543-4
verb second (V2) parameter in Early Modern Irish English (EMIE) 115
vernacular reorganization 232
vernacular universals, previous studies 5
Vietnamese, influence from Chinese 765
vigesimality in Indo-European languages 382-8
Viking invasions, influences on English 436-7
vocabulary lists see Swadesh list
Volga-Kama region, contact with Turkic languages 656-7
vowel shift in London English 246-7
vowel system in sub-Saharan English 521-5


Wai (Fiji koine) 830 
Wakashan languages 679-80, 681, 684-5,
689-90
Wappo 679 
language death 323 
Warlpiri, attrition 329-30 
Warndarang 772 
Washo 370 
Waskia and Takia, metatypy between 
800 
Watam 
borrowing in 799 
and Manam, metatypy between 800-1 
metatypy in 805, 806-8 
we-code 201 
Welsh influence on English relative 
clauses 445-8 
wh-movement 109-10 
Prince Edward Island French (PEIF) 
117-19 
word order 
and attrition 332 
Irish and Irish English 157, 161 
World Atlas of Language Structures (WALS) 
142 


Yahi, sudden language death 322-3 
Yana 682-3 
Yelogu 796 
Yeneseic languages of Siberia 678 
Yiddish 422-3 
and English 37 
influence on American English 462-3 
and Judeo-Sorbian 56 
and Modern Hebrew 56, 57 
syntactic attrition 331 
Yimas 
borrowing in 799 
metatypy in 801-2, 803-4 
Pidgin 819 
Yimas village, trade pidgins in 809-11 
Yirrkala community, Dhuwaya 787 
Yod Dropping 226 
Yolngu languages 772 
Young People’s Dyirbal (YPD) 784 
see also Dyirbal 
Yukaghir family 715, 721, 730 
relationship to Uralic 606 
Yuki 679 
language death 323 
Yumpla Tok 781 
Yurok 683 


